Annals of Mathematics, 149 (1999), 871-904 



Entropy of convolutions on the circle 

By Elon Lindenstrauss, David Meiri, and Yuval Peres 



Abstract 

Given ergodic p-invariant measures $\{\mu_i\}$ on the 1-torus $\mathbb{T} = \mathbb{R}/\mathbb{Z}$, we
give a sharp condition on their entropies, guaranteeing that the entropy of the
convolution $\mu_1 * \cdots * \mu_n$ converges to $\log p$. We also prove a variant of this
result for joinings of full entropy on $\mathbb{T}^{\mathbb{N}}$. In conjunction with a method of
Host, this yields the following. Denote $\sigma_q(x) = qx \pmod 1$. Then for every
p-invariant ergodic $\mu$ with positive entropy, $\frac{1}{N}\sum_{n=0}^{N-1} \sigma_{c_n}\mu$ converges weak* to
Lebesgue measure as $N \to \infty$, under a certain mild combinatorial condition
on $\{c_k\}$. (For instance, the condition is satisfied if $p = 10$ and $c_k = 2^k + 6^k$ or
$c_k = 2^{2^k}$.) This extends a result of Johnson and Rudolph, who considered the
sequence $c_k = q^k$ when p and q are multiplicatively independent.

We also obtain the following corollary concerning Hausdorff dimension
of sum sets: For any sequence $\{S_i\}$ of p-invariant closed subsets of $\mathbb{T}$, if
$\sum_i \dim_H(S_i)/|\log \dim_H(S_i)| = \infty$, then $\dim_H(S_1 + \cdots + S_n) \to 1$.



1. Introduction 

Let $p \ge 2$ be any integer (p need not be prime), and $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ the
1-torus. Denote by $\sigma_p$ the p-to-one map $x \mapsto px \pmod 1$. The pair $(\mathbb{T}, \sigma_p)$ is
a dynamical system that has additional structure: $\mathbb{T}$ is a commutative group
(with the group operation being addition mod 1), and $\sigma_p$ is an endomorphism
of it. Even in such a simple system, the interaction between the dynamics and
the algebraic structure of $\mathbb{T}$ can be quite subtle; the present work continues
the study of this interaction, inspired by the fundamental work of Furstenberg
[8].

Say that a measure $\mu$ on $\mathbb{T}$ is p-invariant if $\sigma_p\mu = \mu$, where for every set
$A \subset \mathbb{T}$, $\sigma_p\mu(A) = \mu(\sigma_p^{-1}(A))$.
(All measures we consider are Borel probability measures.) Lebesgue measure
on $\mathbb{T}$, denoted $\lambda$, has entropy $\log p$ with respect to $\sigma_p$, and is the unique
p-invariant measure of maximal entropy. Given two p-invariant measures $\mu$
and $\nu$, the group structure of $\mathbb{T}$ naturally yields another p-invariant measure:
the convolution $\mu * \nu$.

Our main results, Theorems 1.1 and 1.8, concern the entropy growth for 
convolutions of p-invariant measures and their ergodic components. These re- 
sults have applications to the Hausdorff dimension of sum sets and to genericity 
of the orbits of measures with positive entropy under multiplication by certain 
integer sequences. 

Theorem 1.1 (the Convolution Theorem). Let $\{\mu_i\}$ be a sequence of
p-invariant and ergodic measures on $\mathbb{T}$ whose normalized base-p entropies
$h_i = h(\mu_i, \sigma_p)/\log p$ satisfy

$$\sum_{i=1}^{\infty} \frac{h_i}{|\log h_i|} = \infty. \tag{1}$$

Then

$$h(\mu_1 * \cdots * \mu_n, \sigma_p) \to \log p \quad \text{monotonically, as } n \to \infty.$$

In particular, $\mu_1 * \cdots * \mu_n \to \lambda$ weak* and in the $\bar d$ metric (with respect to the
base-p partition).

It is relatively easy to see that under the hypotheses of the theorem,
$\mu_1 * \cdots * \mu_n \to \lambda$ weak*. This means that

$$\int f(x)\, d(\mu_1 * \cdots * \mu_n) \to \int f(x)\, d\lambda$$

for all continuous $f$, and gives very little information on the dynamics of
$\mu_1 * \cdots * \mu_n$.

The convergence of the entropy to $\log p$ is equivalent to the much stronger
statement that $\mu_1 * \cdots * \mu_n \to \lambda$ in the $\bar d$ metric. As we will not use this
metric in our arguments, we only recall its definition and refer the reader to
Rudolph [17] for further information. Consider two p-invariant measures $\nu_1$
and $\nu_2$ on $\mathbb{T}$. What $\bar d(\nu_1, \nu_2) < \varepsilon$ means is that there exists a p-invariant
measure $\nu$ on $\mathbb{T}^2$ that projects to $\nu_1$ in the first coordinate and to $\nu_2$ in the
second coordinate, such that for $\nu$-almost every $(x, y) \in \mathbb{T}^2$, the set of integers
$k \ge 1$ for which the $k$-th digits in the base p expansions of $x$ and $y$ differ, has
asymptotic density less than $\varepsilon$. Once we establish the entropy convergence, the
$\bar d$ convergence of $\mu_1 * \cdots * \mu_n$ to $\lambda$ is an immediate corollary of the fact that
$\lambda$ is a Bernoulli measure, and hence is finitely determined; see Rudolph [17,
Chaps. 6 and 7], for the relevant definitions and proofs.

The entropy condition in Theorem 1.1 is sharp: if $\{h_i\}$ is a sequence
of numbers in the range $(0, 1)$ with $\sum_i h_i/|\log h_i| < \infty$, then there exists a
sequence of p-invariant ergodic measures $\{\mu_i\}$, such that $h(\mu_i, \sigma_p)/\log p = h_i$,
yet $\mu_1 * \cdots * \mu_n$ does not converge to Lebesgue measure $\lambda$ even in the weak*
topology; see Example 10.2.

The convolution theorem has implications for Hausdorff dimension of sum
sets:

Corollary 1.2. Let $\{S_i\}$ be a sequence of p-invariant closed subsets of
$\mathbb{T}$, and suppose that

$$\sum_{i=1}^{\infty} \frac{\dim_H(S_i)}{|\log \dim_H(S_i)|} = \infty.$$

Then $\dim_H(S_1 + \cdots + S_n) \to 1$.

By Furstenberg [8, III, 2], the conclusion is equivalent to
$h_{\mathrm{top}}(S_1 + \cdots + S_n, \sigma_p) \to \log p$.

If the measures $\mu_i$ are not weakly mixing, the measure $\mu_1 * \cdots * \mu_n$ might
be nonergodic, with different fibers carrying different entropies (see
Example 9.4). The reason for this is that for two measures $\mu$ and $\nu$ on $\mathbb{T}$, the
convolution $\mu * \nu$ is the projection of the product measure $\mu \times \nu$ on $\mathbb{T} \times \mathbb{T}$ to $\mathbb{T}$;
if $\mu$ and $\nu$ are not weakly mixing, then $\mu \times \nu$ need not be ergodic (indeed, $\mu$ is
weakly mixing if and only if $\mu \times \mu$ is ergodic). In this case, ergodic components
of $\mu \times \nu$ can project to ergodic components of $\mu * \nu$ with distinct entropies
(however, it is easy to see that the entropy of almost all ergodic components
of $\mu * \nu$ is at least the entropy of $\mu$; see Corollary 9.3). In this more general
situation, when the $\mu_i$ are not assumed to be weakly mixing, it turns out to be
both natural and important for the applications to give more accurate
information than that provided by Theorem 1.1 regarding the ergodic components
of $\mu_1 * \cdots * \mu_n$. This is done in Theorem 1.8 below; the proof of this more
detailed result is rather delicate. As before, such a result can be used to obtain
an estimate of Hausdorff dimension of sum sets:

Theorem 1.3. Let $\{\mu_i\}$ be a sequence of p-invariant ergodic measures
on $\mathbb{T}$, and suppose that $\inf_i h(\mu_i, \sigma_p) > 0$. Let $\{S_i\}$ be a sequence of Borel
subsets of $\mathbb{T}$, and suppose that $\mu_i(S_i) > 0$ for all $i \ge 1$. Then
$\dim_H(S_1 + \cdots + S_n) \to 1$.

Note that the measures $\mu_i(S_i)$ can tend to zero arbitrarily fast.

Our initial motivation for studying entropy of convolutions of p-invariant
measures was to find conditions on a sequence of integers $\{c_n\}$ and a measure
$\mu$, which imply that $\mu$ is $\{c_n\}$-generic, i.e., the averages
$\frac{1}{N}\sum_{n=0}^{N-1} \sigma_{c_n}\mu$ converge
weakly to Lebesgue measure $\lambda$. In certain cases, this could be established for
ergodic p-invariant measures of sufficiently high entropy, and the idea was to
deduce $\{c_n\}$-genericity for all ergodic p-invariant measures of positive entropy
by repeated convolutions. Indeed, we establish Theorem 1.4 below using this
scheme.

The works of Host [9] and Meiri [13] indicate that combinatorial properties
of $\{c_n\}$ can be used to prove $\{c_n\}$-genericity for invariant measures with
positive entropy. The combinatorial property we need is weaker than those
assumed in the quoted papers; we require that the number of pairs among the
first $p^n$ elements of $\{c_n\}$ which are congruent mod $p^n$ is exponentially smaller
than $p^{2n}$.

Definition 1.1. Given an integer-valued sequence $\{c_n\}$ and an integer
$p > 1$, we define the p-adic collision exponent of the sequence as

$$\Gamma_p(\{c_n\}) = \limsup_{n \to \infty} \frac{\log |\{0 \le k, \ell < p^n : c_k \equiv c_\ell \pmod{p^n}\}|}{n \log p}.$$

Since pairs with $k = \ell$ are allowed, we always have $1 \le \Gamma_p(\{c_n\}) \le 2$. For
example, whenever $p, q > 1$ are relatively prime, the p-adic collision exponent
of $\{q^n\}$ is 1. If we assume only that there is some prime factor of p which does
not divide q, then $\{q^n\}$ has p-adic collision exponent $< 2$. (See §3.2 for more
details and refinements of this definition.)
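
As an informal illustration of Definition 1.1, the quantity inside the lim sup can be evaluated by brute force for small $n$. The sketch below is not part of the paper: the helper names `collision_pairs`, `collision_ratio` and `power_sequence` are hypothetical, and a sequence is passed as a function `c(k, m)` returning $c_k$ mod $m$, so that $q^k$ can be reduced with modular exponentiation. A single finite $n$ gives only a rough indication of the lim sup.

```python
import math
from collections import Counter

def collision_pairs(c, p, n):
    """Count pairs (k, l), 0 <= k, l < p**n, with c_k = c_l (mod p**n).

    `c(k, m)` must return c_k reduced modulo m.
    """
    m = p ** n
    counts = Counter(c(k, m) % m for k in range(m))
    # A residue class hit v times contributes v*v ordered pairs.
    return sum(v * v for v in counts.values())

def collision_ratio(c, p, n):
    """Finite-n value log(#pairs) / (n log p) from Definition 1.1."""
    return math.log(collision_pairs(c, p, n)) / (n * math.log(p))

def power_sequence(q):
    """The sequence c_k = q**k, reduced with modular exponentiation."""
    return lambda k, m: pow(q, k, m)

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, round(collision_ratio(power_sequence(2), 3, n), 4))
```

The two extreme cases are visible directly: an injective sequence such as $c_k = k$ gives ratio 1, while a constant sequence gives ratio 2.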

Using this definition we can state our results on genericity of orbits of 
p-invariant measures: 

Theorem 1.4. Let $\{c_n\}$ be a sequence with p-adic collision exponent $< 2$,
for some $p > 1$. Then any p-invariant ergodic measure $\mu$ on $\mathbb{T}$ with positive
entropy is $\{c_n\}$-generic, i.e.,

$$\frac{1}{N}\sum_{n=0}^{N-1} \sigma_{c_n}\mu \to \lambda \quad \text{in the weak* topology.} \tag{2}$$

For instance, the hypothesis is satisfied if $p = 10$ and $c_n = 2^n + 6^n$ or
$c_n = 2^{2^n}$. See Section 4 for other examples. In fact, the following stronger
form of convergence holds. Recall that the space of probability measures on $\mathbb{T}$
endowed with the weak* topology is a compact metric space, and let $\rho_*$ be
some metric on it.

Theorem 1.5. Under the conditions of Theorem 1.4, $\mu$ is $\{c_n\}$-normal
in probability, i.e.,

$$\int \rho_*\Big(\frac{1}{N}\sum_{n=0}^{N-1} \delta_{c_n x},\ \lambda\Big)\, d\mu(x) \to 0. \tag{3}$$



To be more specific, let

$$\hat\mu(k) = \int e^{2\pi i k x}\, d\mu.$$

Then $\rho_*(\mu, \nu) = \sum_{k=-\infty}^{\infty} 2^{-|k|}\, |\hat\mu(k) - \hat\nu(k)|^2$ is a metric on the space of all
p-invariant probability measures on $\mathbb{T}$ with the weak* topology. Define for
any integer $k \ne 0$,

$$g_N^{(k)}(x) := \frac{1}{N}\sum_{n=0}^{N-1} e(k c_n x), \tag{4}$$

where $e(x) := e^{2\pi i x}$. Then (2) is equivalent to: for all $k \ne 0$, $\int g_N^{(k)}(x)\, d\mu \to 0$,
while (3) is equivalent to the stronger property: for all $k \ne 0$,
$\int |g_N^{(k)}(x)|^2\, d\mu \to 0$.
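
The averages in (4) can be evaluated exactly at rational points, which may help in experimenting with the two modes of convergence. The following is a minimal sketch, not taken from the paper: the function names `e` and `g` simply mirror the notation above, the sequence is passed as a hypothetical callable `c(n)`, and the phase is reduced modulo 1 with exact rational arithmetic before converting to floating point.

```python
import cmath
from fractions import Fraction

def e(x):
    """e(x) = exp(2*pi*i*x); x is reduced modulo 1 exactly before conversion."""
    x = Fraction(x) % 1
    return cmath.exp(2j * cmath.pi * float(x))

def g(k, c, N, x):
    """The average g_N^{(k)}(x) = (1/N) * sum_{n=0}^{N-1} e(k * c(n) * x)."""
    x = Fraction(x)
    return sum(e(k * c(n) * x) for n in range(N)) / N

if __name__ == "__main__":
    # c_n = 2**n evaluated at x = 1/7, for a few values of N.
    for N in (3, 30, 300):
        print(N, abs(g(1, lambda n: 2 ** n, N, Fraction(1, 7))))
```

Working with `Fraction` avoids the loss of precision that would occur if $c_n x$ were formed in floating point for rapidly growing $c_n$.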

The case $c_n = q^n$ is known, and inspired our general investigation. Even
though there are multiplicatively independent p and q with $\Gamma_p(\{q^n\}) = 2$, the
following is still a corollary of the above results.

Corollary 1.6 (Johnson and Rudolph [11, Thm. 8.6]). Suppose that
$p, q > 1$ are multiplicatively independent and $\mu$ is a p-invariant and ergodic
measure on $\mathbb{T}$ with positive entropy. Then $\mu$ is $\{q^n\}$-normal in probability.

Our proofs of Theorems 1.4-1.5 use two main tools. Host [9] developed
a harmonic analysis method which is most powerful when the entropy of $\mu$ is
large. The following general result then allows us to use Host's method for
all measures with nonzero entropy by reduction (via convolutions) to the case
where the entropy of the measure $\mu$ is sufficiently high.

Theorem 1.7 (the Bootstrap Lemma). Suppose that $\mathcal{C}$ is a class of
p-invariant measures on $\mathbb{T}$ with the following properties:

(i) If $\mu$ is p-invariant and ergodic and $\mu * \mu \in \mathcal{C}$, then $\mu \in \mathcal{C}$.

(ii) If $\mu$ is p-invariant and almost every ergodic component of $\mu$ is in $\mathcal{C}$, then
$\mu \in \mathcal{C}$.

(iii) There exists some constant $h_0 < \log p$ such that every p-invariant and
ergodic measure $\mu$ with $h(\mu, \sigma_p) > h_0$ is in $\mathcal{C}$.

Then $\mathcal{C}$ contains all p-invariant ergodic measures with positive entropy.

We derive Theorem 1.7 from a variant of Theorem 1.1 for joinings of full 
entropy: 

Definition 1.2. Let $\{\mu_i\}_{i \ge 1}$ be a sequence of p-invariant and ergodic
measures on $\mathbb{T}$. A measure $\nu^{(n)}$ on $\mathbb{T}^n$ is called a joining of $\mu_1, \ldots, \mu_n$ if

(i) $\nu^{(n)}$ is $\sigma_p \times \cdots \times \sigma_p$ invariant,

(ii) The projection of $\nu^{(n)}$ on the $i$-th coordinate is $\mu_i$, for $i = 1, \ldots, n$.

The measure $\nu^{(n)}$ is called a joining of full entropy of $\mu_1, \ldots, \mu_n$ if in addition

$$h(\nu^{(n)}, \sigma_p \times \cdots \times \sigma_p) = \sum_{i=1}^{n} h(\mu_i, \sigma_p).$$

A measure $\tilde\mu$ on $\mathbb{T}^{\mathbb{N}}$ is called a joining of full entropy of $\{\mu_i\}_{i=1}^{\infty}$ if for every
$n$, the projection of $\tilde\mu$ to the first $n$ coordinates is a joining of full entropy of
$\mu_1, \ldots, \mu_n$.

Theorem 1.8. Let $\{\mu_i\}_{i=1}^{\infty}$ be a sequence of p-invariant and ergodic
measures on $\mathbb{T}$ such that $\inf_i h(\mu_i, \sigma_p) > 0$. Suppose that $\tilde\mu$ is a joining of full
entropy of $\{\mu_i\}_{i=1}^{\infty}$. Define $\Theta^n : \mathbb{T}^{\mathbb{N}} \to \mathbb{T}$ by $\Theta^n(x) = x_1 + \cdots + x_n \pmod 1$.
Then

$$h(\Theta^n\tilde\mu, \sigma_p) \to \log p \quad \text{monotonically, as } n \to \infty.$$

Theorem 1.8 is not valid under the weaker entropy assumptions of
Theorem 1.1. Indeed, it is possible to find a joining of full entropy $\tilde\mu$ with entropies
satisfying (1), such that $\Theta^n\tilde\mu$ does not even converge weak* to $\lambda$. See
Example 10.5.

1.1. Background. In Furstenberg [8] many aspects of the dynamics of
$(\mathbb{T}, \sigma_p)$ are discussed, and in particular it is shown that there are no nontrivial
$\sigma_p$-invariant closed subsets of $\mathbb{T}$ that are also invariant under $\sigma_q$ for p and q
multiplicatively independent (i.e. $\log p/\log q \notin \mathbb{Q}$), the trivial examples being
the whole of $\mathbb{T}$ and some finite sets of periodic points. Furstenberg conjectured
the following stronger result:

Furstenberg's Conjecture. The only ergodic invariant measures
for the semi-group of circle endomorphisms generated by $\sigma_p$ and $\sigma_q$, for p and
q multiplicatively independent, are Lebesgue measure $\lambda$ and atomic measures
concentrated on periodic orbits.

Most of the research on the dynamics of $(\mathbb{T}, \sigma_p)$ has been related to this
conjecture. It has been proven for measures with positive entropy — a partial 
result was proved by Lyons [12], and the case of p and q relatively prime was 
settled by Rudolph [18]. The case of p and q multiplicatively independent 
but not relatively prime is harder, and was proved by Johnson [10]. Another 
argument along the lines of Lyons [12] for the multiplicatively independent case 
was given by Feldman [6]. To tackle the case of measures with zero entropy it 
seems that a totally different method is needed. 

In Feldman and Smorodinsky [7], Johnson and Rudolph [11], and Host
[9], the measure is only assumed to be $\sigma_p$-invariant, and the authors consider
the action of $\sigma_{c_n}$ on $\mu$ for the special case of $c_n = q^n$. As shown in Meiri [13],
the methods of Host imply the following result:

Theorem A. For $p > 1$, let $\{c_n\}$ be a sequence with p-adic collision
exponent 1. Then any p-invariant ergodic measure $\mu$ with positive entropy is
$\{c_n\}$-normal a.e., i.e., $\{c_n x \pmod 1\}$ is uniformly distributed for $\mu$-almost
every $x \in \mathbb{T}$.

This theorem gives a significantly stronger statement than Theorems 1.4-1.5,
but for a smaller class of sequences $\{c_n\}$, and in particular gives no
information for the case $c_n = q^n$, with p and q not relatively prime. (Recalling
the definition of $g_N^{(k)}$ in (4), this assertion is equivalent to: for all $k \ne 0$,
$g_N^{(k)} \to 0$ $\mu$-a.e.) A more detailed history of this problem can be found in Host
[9] and Meiri [15].

For the convolution results (Thms. 1.1 and 1.8), the case of measures with
zero entropy is not interesting, as the entropy of a convolution of measures,
each having zero entropy, is zero, and hence the convolution does not converge
to $\lambda$ in $\bar d$.

Regarding Theorems 1.2-1.3, one can ask if a stronger conclusion holds,
namely, that $S_1 + \cdots + S_n$ contains an interval for all $n$ sufficiently large.
Brown and Williamson [4] showed that if $\mu$ is a measure on $\mathbb{T}$ which makes
the digits in base p i.i.d. nondegenerate variables, and $\mu(S_i) > 0$ for all $i$, then
this stronger assertion is true. However, under our weaker assumptions, this
conclusion is not valid, even if all the sets $S_i$ coincide: By Furstenberg [8,
Thm. III.2] there exists a minimal p-invariant closed set $S \subset \mathbb{T}$ with positive
dimension. By Proposition IV.1 of the same article, the sum sets $S + \cdots + S$
of any finite order have Lebesgue measure zero. For more information about
Lebesgue measure of sum sets see Brown, Keane, Moran, and Pearce [3].

1.2. Overview. The paper is organized as follows. In Section 2 we show 
how one can deduce the Bootstrap Lemma from Theorem 1.8, and proceed 
to prove Theorem 1.4. In Section 3 we discuss convergence in probability and 
prove Theorem 1.5. In Section 4 we derive Corollary 1.6 from Theorem 1.4, and 
discuss p-adic combinatorial properties, and in particular we give an algorithm 
for computing the p-adic collision exponent of linear recursive sequences. 

In Sections 5-7 we prove our main results, Theorems 1.1 and 1.8. The
simplest case is proving Theorem 1.1 for prime p. In this case, however, one
does not need to use a substantial part of the ideas behind the proof. We
recommend for first reading to have in mind the case $p = 10$, with $\mu_i = \mu$ for
all $i$ for Theorem 1.1. Theorem 1.8 is interesting already when p is a prime
(and we again recommend considering first the case where all $\mu_i$ are identical).

Section 5 contains results about finite cyclic groups which are crucial to
the proof of the convolution theorems. Lemmas 5.1 and 5.2 study convolutions
of measures on a finite cyclic group and contain one key idea in the proof,
namely that the convolution of a sequence of measures on a finite cyclic group
of order $N$ (we shall use $N = p^k$) tends to be invariant under a subgroup
that will typically be rather large (in the cases we will be interested in, this
subgroup will be of order approximately $p^{\alpha k}$ for some $0 < \alpha < 1$). Lemma 5.3
shows that if a measure on $\mathbb{Z}/p^k\mathbb{Z}$ is almost invariant under a subgroup of size
$p^{\alpha k}$, the distribution of the $\alpha k$ high order digits is nearly uniform.

In Section 6 we begin to show how convolutions of measures on $\mathbb{Z}/p^k\mathbb{Z}$
relate to convolutions of measures on $\mathbb{T}$, where we get measures on $\mathbb{Z}/p^k\mathbb{Z}$
from measures on $\mathbb{T}$ by considering the conditional distribution of the first $k$
digits in the base-p expansion of $x \in \mathbb{T}$, given the rest of the digits. In Section 7
we continue in this approach and prove Theorem 1.1. The basic argument is
that if the entropy of $\mu_1 * \cdots * \mu_n$ is almost

$$\sup_{N \in \mathbb{N}} h(\mu_1 * \cdots * \mu_N, \sigma_p),$$

then for any $k \ge 1$ the distribution of the first $k$ digits of $x$ given the rest
of the digits ($x$ chosen according to $\mu_1 * \cdots * \mu_n$) must be nearly invariant
under a subgroup $G \subset \mathbb{Z}/p^k\mathbb{Z}$ of size $p^{\alpha k}$; for if it is not, then the entropy
of $\mu_1 * \cdots * \mu_n$ can be significantly increased by further convolutions. This
implies that the first $\alpha k$ digits of $x$ are distributed nearly uniformly. Since $k$
is arbitrary, it follows that $h(\mu_1 * \cdots * \mu_n, \sigma_p) \approx \log p$.

In Section 8 we prove Theorem 1.8. The main observation is that if one 
chooses an element in T N according to a joining of full entropy, the components 
of this vector are almost independent (Lemma 8.1). This allows us to prove 
Theorem 1.8 along similar lines as Theorem 1.1. In Section 9 we use the 
connection between entropy of measures and Hausdorff dimension, to derive 
Theorems 1.2-1.3 from the results about convolutions of measures. Section 10 
contains some concluding remarks, questions and examples. 

2. Proof of the Bootstrap Lemma and Theorem 1.4 

In order to derive the Bootstrap Lemma from Theorem 1.8, we need the 
following lemma: 

Lemma 2.1. Let $\mu$ denote a p-invariant and ergodic measure on $\mathbb{T}$, and
let $\mu \times \mu = \int \nu_\gamma\, d\gamma$ denote the ergodic decomposition of $\mu \times \mu$. Then $\nu_\gamma$ is a
self-joining of $\mu$ for a.e. $\gamma$, and $h(\nu_\gamma, \sigma_p \times \sigma_p) = 2h(\mu, \sigma_p)$ a.e.

Proof. By ergodicity of $\mu$, $\nu_\gamma$ almost-surely projects to $\mu$ in both
coordinates. Obviously, $h(\nu_\gamma, \sigma_p \times \sigma_p) \le h(\mu \times \mu, \sigma_p \times \sigma_p) = 2h(\mu, \sigma_p)$. On the other
hand, by Rokhlin's theorem, $h(\mu \times \mu, \sigma_p \times \sigma_p) = \int h(\nu_\gamma, \sigma_p \times \sigma_p)\, d\gamma$, therefore
$h(\nu_\gamma, \sigma_p \times \sigma_p) = 2h(\mu, \sigma_p)$ a.e. □

Proof of Theorem 1.7 (the Bootstrap Lemma). Let $\mu$ be some p-invariant
and ergodic measure with positive entropy. We claim that $\mu \in \mathcal{C}$. Suppose
that this is not the case, and write $\nu^{(1)} := \mu$ and $h := h(\mu, \sigma_p) > 0$. From
property (i) of $\mathcal{C}$, also $\mu * \mu \notin \mathcal{C}$. Let $\mu \times \mu = \int \nu_\gamma\, d\gamma$ denote the ergodic
decomposition of $\mu \times \mu$ with respect to $\sigma_p \times \sigma_p$. Then

$$\mu * \mu = \Theta^2(\mu \times \mu) = \int \Theta^2(\nu_\gamma)\, d\gamma,$$

where as before $\Theta^k(x_1, \ldots, x_k) := x_1 + \cdots + x_k \pmod 1$. From property (ii),
for a set of $\gamma$ with positive measure, $\Theta^2(\nu_\gamma) \notin \mathcal{C}$. By Lemma 2.1, there exists
an ergodic component of $\mu \times \mu$, which we designate $\nu^{(2)}$, such that $\nu^{(2)}$ is
an ergodic self-joining of $\mu$ with entropy $2h$ and $\Theta^2(\nu^{(2)}) \notin \mathcal{C}$. Apply now
the same procedure to $\nu^{(2)}$, finding an ergodic component $\nu^{(4)}$ of $\nu^{(2)} \times \nu^{(2)}$,
such that $\nu^{(4)}$ is a self-joining of $\mu$ with entropy $4h$, and $\Theta^4(\nu^{(4)}) \notin \mathcal{C}$.
Continuing this way we obtain a sequence of measures $\{\nu^{(k)}\}$, defined for $k$
a power of 2, such that $\Theta^k(\nu^{(k)}) \notin \mathcal{C}$, $\nu^{(k)}$ is an ergodic self-joining of $\mu$ in
$\mathbb{T}^k$, and $h(\nu^{(k)}, \sigma_p \times \cdots \times \sigma_p) = kh$. Define $\nu^{(k)}$ for other values of $k$ by
projection, and let $\tilde\mu$ be the inverse limit of these measures, defined on $\mathbb{T}^{\mathbb{N}}$.
Then $\tilde\mu$ is a joining of full entropy. Applying Theorem 1.8 we conclude that
$h(\Theta^k\nu^{(k)}, \sigma_p) = h(\Theta^k\tilde\mu, \sigma_p) \to \log p$. As $\Theta^k\tilde\mu \notin \mathcal{C}$ for $k$ a power of 2, this
contradicts property (iii). □

We next deduce Theorem 1.4 from the Bootstrap Lemma. Given an
integer-valued sequence $\{c_n\}$, say that a measure $\mu$ is $\{c_n\}$-generic if

$$\frac{1}{N}\sum_{n=0}^{N-1} \sigma_{c_n}\mu \to \lambda \quad \text{weakly, as } N \to \infty.$$

The basic tool we use is the following observation in Meiri [13, Thm. 3.1, and
note in §8, problem 3], based on Host [9]:

Proposition 2.2. Fix an integer $p > 1$ and a sequence $\{c_n\}$ with p-adic
collision exponent $< 2$. Then there exists a constant $h_0 < \log p$ such that every
p-invariant and ergodic measure $\mu$ with $h(\mu, \sigma_p) > h_0$ is $\{c_n\}$-normal a.e.; in
particular, $\mu$ is $\{c_n\}$-generic.

Remark. This proposition, as well as Theorems 1.4-1.5, hold under the
weaker assumption that the reduced p-adic exponent of $\{c_n\}$ is less than 2; see
Section 3.2 for the definition and more details.

The following lemma is a consequence of the extremality of ergodic
measures:

Lemma 2.3 (Johnson and Rudolph [11]). Suppose that $\nu, \nu_1, \nu_2, \ldots$ are
invariant measures, and that $\nu$ is ergodic. Suppose also that

$$\frac{1}{N}\sum_{n=1}^{N} \nu_n \to \nu \quad \text{weakly.}$$

Then there exists a zero-density set $J \subset \mathbb{N}$ such that $\nu_n \to \nu$ (weakly) as
$n \to \infty$, $n \notin J$.

Proposition 2.4. An invariant measure $\mu$ is $\{c_n\}$-generic if and only if
$\mu * \mu$ is $\{c_n\}$-generic.

Proof. By the previous lemma, $\mu$ is $\{c_n\}$-generic if and only if there exists
a zero-density set $J \subset \mathbb{N}$ such that $\sigma_{c_n}\mu \to \lambda$ as $n \to \infty$, $n \notin J$. This is
equivalent to

$$\lim_{n \to \infty,\ n \notin J} \hat\mu(a c_n) = 0, \quad \text{for all } a \in \mathbb{Z} \setminus \{0\}. \tag{5}$$

Since $\widehat{\mu * \mu} = \hat\mu^2$, equation (5) holds if and only if it holds when $\mu$ is replaced
by $\mu * \mu$. □

Proof of Theorem 1.4. From Propositions 2.2 and 2.4 it follows that the
class of p-invariant measures which are $\{c_n\}$-generic satisfies the conditions of
the Bootstrap Lemma (Theorem 1.7), and the assertion follows. □

3. Convergence in probability and proof of Theorem 1.5 

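Recall the definition (4) of $g_N^{(k)}$. Given a measure $\mu$, define $\mu^{\#}$ by
$\mu^{\#}(A) = \mu(-A)$.

Lemma 3.1. For any measure $\mu$,

$$\int |g_N^{(k)}(x)|^2\, d\mu \le \Big(\int |g_N^{(k)}(x)|^2\, d(\mu * \mu^{\#})\Big)^{1/2}.$$

Proof. As $\widehat{\mu * \mu^{\#}} = |\hat\mu|^2$, by Cauchy-Schwarz we have

$$\int |g_N^{(k)}(x)|^2\, d\mu
  = \frac{1}{N^2}\sum_{m,l=0}^{N-1} \int e(k(c_l - c_m)x)\, d\mu
  = \frac{1}{N^2}\sum_{m,l=0}^{N-1} \hat\mu(k(c_l - c_m))$$

$$\le \Big(\frac{1}{N^2}\sum_{m,l=0}^{N-1} |\hat\mu(k(c_l - c_m))|^2\Big)^{1/2}
  = \Big(\int |g_N^{(k)}(x)|^2\, d(\mu * \mu^{\#})\Big)^{1/2}.$$

□
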
Proof of Theorem 1.5. A slight change in the proof of the Bootstrap
Lemma lets us replace the first condition by the following:

(i') If $\mu$ is p-invariant and $\mu * \mu^{\#} \in \mathcal{C}$, then $\mu \in \mathcal{C}$.

Taking $\mathcal{C}$ to be the class of p-invariant measures for which (3) holds,
condition (i') is satisfied by the last lemma, (ii) is immediate by Lebesgue
dominated convergence, while (iii) follows from Proposition 2.2. □

The type of convergence in Theorem 1.5 is, in general, stronger than weak 
convergence: 

Example. Let $c_n = 2^{2^n}$. For $j = 0, 1$ consider random variables
$X_j = \sum_{k \ge 1} x_k^{(j)} 2^{-k}$, where $x_{2^k+1}^{(j)} = j$ for all $k$, and all other digits are i.i.d. uniform on
$\{0, 1\}$. Let $\mu_j$ be the distribution of $X_j$, and take $\mu = \frac{1}{2}(\mu_0 + \mu_1)$. Then $\mu$ is
$\{2^{2^n}\}$-generic, but is not $\{2^{2^n}\}$-normal in probability.

However, there are cases where the stronger type of convergence follows 
from the weaker one. The following proposition was obtained by Johnson and 
Rudolph [11, §8] using general convex analysis. Here we give a more direct 
argument. 

Proposition 3.2. Let $q > 1$. Then if $\mu$ is $\{q^n\}$-generic, it is also
$\{q^n\}$-normal in probability.



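Proof. For any $L$, by Cauchy-Schwarz,

$$\Big|\frac{1}{N}\sum_{n=0}^{N-1} g_L^{(k)} \circ \sigma_{q^n}\Big|^2
  \le \frac{1}{N}\sum_{n=0}^{N-1} |g_L^{(k)}|^2 \circ \sigma_{q^n},$$

and so by (2) applied for $c_n = q^n$,

$$\limsup_{N \to \infty} \int \Big|\frac{1}{N}\sum_{n=0}^{N-1} g_L^{(k)} \circ \sigma_{q^n}\Big|^2 d\mu
  \le \limsup_{N \to \infty} \frac{1}{N}\sum_{n=0}^{N-1} \int |g_L^{(k)}|^2 \circ \sigma_{q^n}\, d\mu
  = \int |g_L^{(k)}|^2\, d\lambda = \frac{1}{L}.$$

Since

$$\Big|g_N^{(k)} - \frac{1}{N}\sum_{n=0}^{N-1} g_L^{(k)} \circ \sigma_{q^n}\Big| \le \frac{2L}{N},$$

we conclude that for any $L$,

$$\limsup_{N \to \infty} \int |g_N^{(k)}|^2\, d\mu \le \frac{1}{L};$$

it follows that $\mu$ is $\{q^n\}$-normal in probability. □
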
4. The p-adic collision exponent

Denote by $\mathrm{ord}_G(x)$ the order of an element $x$ in a finite group $G$. The
following statement was noted by Host [9]:

Proposition 4.1. If $p, q > 1$ and $p, q$ are relatively prime, then there
exists $a > 0$ such that

$$\text{for all } n \ge 1, \quad \mathrm{ord}_{\mathbb{Z}/p^n\mathbb{Z}}(q) \ge a p^n.$$

In particular, in this case the p-adic collision exponent of $\{q^n\}$ is 1.
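
The growth of the order in Proposition 4.1 is easy to observe numerically. The sketch below is only an illustration under our own naming: `multiplicative_order` is a hypothetical helper that computes the order of $q$ among the units modulo $m$ by repeated multiplication, which is adequate for small moduli but not an efficient algorithm.

```python
from math import gcd

def multiplicative_order(q, m):
    """Order of q in the group of units modulo m (requires gcd(q, m) == 1)."""
    if m < 1 or gcd(q, m) != 1:
        raise ValueError("q must be a unit modulo m")
    if m == 1:
        return 1
    order, power = 1, q % m
    while power != 1:
        power = (power * q) % m
        order += 1
    return order

if __name__ == "__main__":
    p, q = 10, 3
    for n in range(1, 6):
        o = multiplicative_order(q, p ** n)
        # The ratio o / p**n stays bounded away from zero in this range.
        print(n, o, o / p ** n)
```

For instance, with $p = 3$ and $q = 2$ the order modulo $3^n$ equals $2 \cdot 3^{n-1}$, so the ratio to $p^n$ is constantly $2/3$.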



Proposition 4.2. Suppose that $p, q > 1$ and there exists some prime
factor $p_*$ of p that does not divide q. Then the p-adic collision exponent of
$\{q^n\}$ is $< 2$.

Proof. Define $o_n := \mathrm{ord}_{\mathbb{Z}/p_*^n\mathbb{Z}}(q)$, and use Proposition 4.1 to find $a > 0$
such that $o_n \ge a p_*^n$. If $q^k \equiv q^\ell \pmod{p^n}$ then $p_*^n \mid q^{k-\ell} - 1$; hence $o_n \mid k - \ell$.
Thus

$$|\{0 \le k, \ell < p^n : q^k \equiv q^\ell \pmod{p^n}\}| \le p^n \cdot (p^n/o_n).$$

Hence

$$\Gamma_p(\{q^n\}) = \limsup_{n \to \infty} \frac{\log |\{0 \le k, \ell < p^n : q^k \equiv q^\ell \pmod{p^n}\}|}{n \log p}
  \le \limsup_{n \to \infty} \frac{\log(p^n \cdot (p^n/o_n))}{n \log p}
  \le 2 - \frac{\log p_*}{\log p} < 2.$$

□



Proof of Corollary 1.6. By Proposition 3.2, it is enough to prove weak
convergence. If some prime factor of p does not divide q, we are done by
the last proposition and Theorem 1.4. Otherwise, note the following simple
observations:

(A) If the theorem holds replacing q by some power $q^\ell$, then it holds for q as
well. The reason is that we can decompose

$$\frac{1}{N\ell}\sum_{n=0}^{N\ell-1} \sigma_{q^n}\mu
  = \frac{1}{\ell}\sum_{k=0}^{\ell-1} \frac{1}{N}\sum_{n=0}^{N-1} \sigma_{(q^\ell)^n}\big(\sigma_{q^k}\mu\big)$$

and apply the theorem to the measures $\sigma_{q^k}\mu$, for $k = 0, \ldots, \ell - 1$.

(B) If $p \mid q$ and the theorem holds for p and $q/p$, then it also holds for p and
q. This is because $\sigma_{q/p}\mu = \sigma_q\mu$ for p-invariant $\mu$.

For any multiplicatively independent p, q we can find some $k, \ell$ such that
$p^k \mid q^\ell$ and some prime factor of p does not divide $q' = q^\ell/p^k$, and use the above.

□

It is easy to see that Theorems 1.4-1.5 are not affected if we only assume
that $\{c_n\}$ differs on a set of arbitrarily small density from a sequence with
p-adic collision exponent smaller than 2. This motivates the following definition:

Definition 4.1. The reduced p-adic collision exponent of a sequence $\{c_n\}$
is

$$\Gamma'_p(\{c_n\}) = \lim_{\varepsilon \to 0}\ \inf\ \Gamma_p(\{c'_n\}),$$

where $\{c'_n\}$ ranges over sequences which agree with $\{c_n\}$ on a set of indices
with density $> 1 - \varepsilon$.

We always have $1 \le \Gamma'_p(\{c_n\}) \le \Gamma_p(\{c_n\}) \le 2$. The sequence $c_n = n^\ell$
for $\ell \ge 2$ has p-adic collision exponent $2(1 - \ell^{-1})$, while its reduced collision
exponent can be seen to be 1.

Computation of $\Gamma'_q(\{c_n\})$ for a linear recursion. We conclude this section
with an algorithm for computing the reduced q-adic collision exponent of any
linear recursion sequence and any integer $q > 1$. Let $\{c_k\}$ be such a sequence,
i.e., for certain integers $a_0, \ldots, a_{n-1}$ ($a_0 \ne 0$) we have

$$\text{for all } k \ge n, \quad c_k + a_{n-1}c_{k-1} + a_{n-2}c_{k-2} + \cdots + a_0 c_{k-n} = 0. \tag{6}$$

Denote by $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$ the recursion polynomial
of (6). We can assume that $f$ is of minimal degree. If $\{c_k\}$ is constant along
some arithmetic progression, surely $\Gamma'_q(\{c_n\}) = 2$. Call $f$ nondegenerate if the
only sequence $\{c_k\}$ satisfying $f$ and having a constant arithmetic subsequence
is the zero sequence. We quote the following results:

Theorem 4.3 ([13, Thms. 5.1-5.2]). Let $f$ denote a linear recursion.

(i) $f$ is nondegenerate if and only if no roots of $f$, or their ratios, are roots
of unity.

(ii) Let $\{c_k\}$ be a sequence of integers satisfying $f$, and suppose that $\{c_k\}$ has
no constant arithmetic subsequences. Then $\Gamma'_q(\{c_k\}) = 1$ for any $q > 1$
relatively prime to $f(0)$.

Note. In fact a stronger result is proved there: after discarding a set of
arbitrarily small density, $\{c_k\}$ has bounded cells, i.e.,

$$\sup_n \max_{0 \le t < q^n} |\{0 \le k < q^n : c_k \equiv t \pmod{q^n}\}| < \infty.$$

For proofs of the results in the rest of this section, see Meiri [15].

Call a sequence $\{c_k\}$ an intertwining of the sequences $\{c_k^{(1)}\}, \ldots, \{c_k^{(N)}\}$ if
$c_{N(k-1)+r} = c_k^{(r)}$ for all $k \ge 1$ and $r = 1, \ldots, N$.

Lemma 4.4. If $\{c_k\}$ is an intertwining of the sequences $\{c_k^{(1)}\}, \ldots, \{c_k^{(N)}\}$,
then

$$\Gamma_q(\{c_k\}) = \max_{1 \le r \le N} \Gamma_q(\{c_k^{(r)}\}), \tag{7}$$

and a similar result holds for the reduced collision exponent.
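
The index bookkeeping in the definition of an intertwining can be made concrete with a short sketch. The function names `deintertwine` and `intertwine` are hypothetical and not from the paper; sequences are stored as 0-based Python lists, so the paper's $c_{N(k-1)+r}$ (1-based) sits at list position $N(k-1) + r - 1$.

```python
def deintertwine(c, N):
    """Split c_1, c_2, ... into N subsequences with c_{N(k-1)+r} = c^{(r)}_k.

    `c` is a list holding c_1, c_2, ... at positions 0, 1, ...;
    entry r-1 of the result is the subsequence c^{(r)}.
    """
    return [c[r - 1::N] for r in range(1, N + 1)]

def intertwine(subsequences):
    """Inverse operation: interleave equally long subsequences."""
    out = []
    for block in zip(*subsequences):
        out.extend(block)
    return out

if __name__ == "__main__":
    c = [1, 10, 100, 2, 20, 200, 3, 30, 300]
    parts = deintertwine(c, 3)
    print(parts)              # three subsequences of length 3
    print(intertwine(parts))  # recovers c
```

Combined with a routine for the collision exponent of each part, this is the splitting that Lemma 4.4 reduces to.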

For a polynomial $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$, define

$$\gcd[f] = \gcd(a_0, a_1, \ldots, a_{n-1}).$$

Theorem 4.5. Let $\{c_k\}$ be a linear recursion sequence, and suppose that
its minimal recursion polynomial $f$ is nondegenerate. Decompose $q = q_1 q_2$
such that $q_1$ is the maximal factor of q that is relatively prime to $\gcd[f]$. Then

$$\Gamma'_q(\{c_k\}) = 2 - \frac{\log q_1}{\log q} = 1 + \frac{\log q_2}{\log q}. \tag{8}$$

In particular we have $\Gamma'_q(\{c_k\}) = 1$ if and only if no prime factor of q divides
$\gcd[f]$, and $\Gamma'_q(\{c_k\}) = 2$ if and only if all prime factors of q divide $\gcd[f]$.
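
A small computational sketch of Theorem 4.5 follows. It assumes that formula (8) reads $\Gamma'_q(\{c_k\}) = 2 - \log q_1/\log q$, which is consistent with the two "if and only if" statements above, and it does not check nondegeneracy of $f$; that remains the caller's responsibility. The function names `poly_gcd`, `split_q` and `reduced_exponent` are ours, and the polynomial is given by its non-leading coefficients $[a_0, \ldots, a_{n-1}]$.

```python
from functools import reduce
from math import gcd, log

def poly_gcd(coeffs):
    """gcd[f] = gcd(a_0, ..., a_{n-1}) for monic f; coeffs = [a_0, ..., a_{n-1}]."""
    return reduce(gcd, (abs(a) for a in coeffs), 0)

def split_q(q, g):
    """Return (q1, q2) with q = q1*q2 and q1 the largest factor of q coprime to g."""
    q1 = q
    while gcd(q1, g) > 1:
        q1 //= gcd(q1, g)   # strip primes shared with g
    return q1, q // q1

def reduced_exponent(q, coeffs):
    """Value 2 - log(q1)/log(q), assuming f is nondegenerate (not verified here)."""
    q1, _ = split_q(q, poly_gcd(coeffs))
    return 2 - log(q1) / log(q)

if __name__ == "__main__":
    # f(x) = x - 6, the minimal polynomial of c_k = 6**k.
    for q in (5, 10, 12, 35):
        print(q, split_q(q, 6), reduced_exponent(q, [-6]))
```

For $f(x) = x^2 - 8x + 12$ (the sequence $2^k + 6^k$) and $q = 10$ this gives $q_1 = 5$, $q_2 = 2$, and a value strictly between 1 and 2.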

Algorithm. Computing the reduced q-adic collision exponent of a linear
recursion sequence.

1. Compute the minimal recursion polynomial $f$ of $\{c_k\}$.
Method: solve linear equations on the coefficients of $f$.

2. If $f(-1) = 0$, consider separately $\{c_{2k}\}$ and $\{c_{2k+1}\}$: each satisfies a
recursion of lower degree; apply Lemma 4.4.

3. If $f(1) = 0$, $f'(1) \ne 0$, find constants $r, s$ such that the minimal
polynomial of $\{r c_k - s\}$ does not vanish at 1, and replace $\{c_k\}$ by the latter
sequence. The minimal polynomial of the new sequence does not vanish
at 1.
Method: find a rational number $s/r$ such that $\{c_k - s/r\}$ satisfies
$f(x)/(x - 1)$.

4. If $f(1) = f'(1) = 0$, then $\Gamma'_q(\{c_k\}) = 1$.

5. Check if $f$ is an intertwining of sequences, satisfying shorter recursions.
Method: find the maximal number $D$ such that $\varphi(D) \le n$, where $\varphi$ is
Euler's totient function. For $d = 2, \ldots, D$, display $\{c_k\}$ as an intertwining
of $d$ subsequences, and compute their minimal polynomials. If for
some $d$ all the resulting polynomials have degree less than $\deg f$, apply
Lemma 4.4. If not, then $f$ is nondegenerate.

6. If $f$ is nondegenerate, apply (8) to compute $\Gamma'_q(\{c_k\})$.

Examples.

1. The sequence $4, 9, 39, 219, \ldots$ which satisfies the recursion
$c_k = 7c_{k-1} - 6c_{k-2}$ has minimal polynomial $f(x) = x^2 - 7x + 6$. In step 3 we see
that $f(1) = 0$; indeed, $c_k = 3 + 6^k$ and so
$\Gamma'_q(\{c_k\}) = \Gamma_q(\{c_k\}) = \Gamma_q(\{6^k\}) = 1 + \frac{\log(2^m 3^\ell)}{\log q}$,
assuming that $2^m \,\|\, q$, $3^\ell \,\|\, q$. On the other hand, a sequence $\{c_k\}$
whose minimal polynomial is $(x - 1)f(x)$ satisfies $\Gamma'_q(\{c_k\}) = 1$ for every $q > 1$.

2. Suppose that $f(x) = x^6 - x^3 - 1$ is the minimal recursion of $\{c_k\}$.
Then $\{c_k\}$ is an intertwining of 3 sequences, each satisfying the Fibonacci
recursion $F_k = F_{k-1} + F_{k-2}$. If either of these sequences is identically zero,
then $\Gamma'_q(\{c_k\}) = 2$. Otherwise, $\Gamma'_q(\{c_k\}) = 1$.

3. If the minimal polynomial $f$ of $\{c_k\}$ is nondegenerate, and some prime
factor of q does not divide $\gcd[f]$, then $\Gamma'_q(\{c_k\}) < 2$.

4. Suppose that $c_k = r_1(k)q_1^k + \cdots + r_\ell(k)q_\ell^k$, where $r_j$ are polynomials.
Suppose also that there exists some prime factor of q that does not divide some
$q_j$. Then $\Gamma'_q(\{c_k\}) < 2$.

5. Uniform distribution in subgroups 

The next three simple lemmas are the key to proving Theorem 1.1. 

Lemma 5.1. Let $\{X_n\}$ be an infinite sequence of independent random
variables with values in $\mathbb{Z}_N := \mathbb{Z}/N\mathbb{Z}$, for some fixed integer $N > 1$. Suppose
that for some nonzero $g \in \mathbb{Z}_N$,

$$\sum_{j=1}^{\infty} \sum_{x=0}^{N-1} \min\{\mathbb{P}(X_j = x),\ \mathbb{P}(X_j = x + g)\} = \infty. \tag{9}$$

Let $S_n = X_1 + \cdots + X_n \pmod N$. Then for any $x \in \mathbb{Z}_N$,

$$\lim_{n \to \infty} \big(\mathbb{P}(S_n = x + g) - \mathbb{P}(S_n = x)\big) = 0. \tag{10}$$

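Proof. We first prove the following property of Fourier coefficients of $S_n$:

(11) for all $\ell \in \mathbb{Z}_N$, $g\ell \not\equiv 0 \pmod N \implies \lim_{n\to\infty} \hat{S}_n(\ell) = 0$.

For some $x \in \mathbb{Z}_N$,

$\sum_{j=1}^{\infty} \min\{P(X_j = x),\, P(X_j = x + g)\} = \infty.$

Set $p_j = \min\{P(X_j = x),\, P(X_j = x + g)\}$. Denoting $\varphi_\ell(t) = \exp(2\pi i \ell t/N)$, we
have

$\hat{S}_n(\ell) = E\,\varphi_\ell\Bigl(\sum_{j=1}^{n} X_j\Bigr) = \prod_{j=1}^{n} E\,\varphi_\ell(X_j).$

Write

$|E\,\varphi_\ell(X_j)| = \Bigl|\sum_{k=0}^{N-1} P(X_j = k)\,\varphi_\ell(k)\Bigr| \le p_j \left|\frac{\varphi_\ell(x) + \varphi_\ell(x + g)}{2}\right| + (1 - p_j) = p_j \left|\frac{1 + \exp(2\pi i \ell g/N)}{2}\right| + (1 - p_j).$

Since $N \nmid g\ell$, we have $|1 + \exp(2\pi i \ell g/N)|/2 \le \gamma < 1$, for $\gamma \stackrel{\mathrm{def}}{=} |1 + \exp(2\pi i/N)|/2$.
Hence

$|E\,\varphi_\ell(X_j)| \le 1 - (1 - \gamma)\,p_j.$

By our assumptions, $\sum_{j=1}^{\infty} (1 - \gamma)p_j = \infty$; hence $\lim_{n\to\infty} \prod_{j=1}^{n} |E\,\varphi_\ell(X_j)| = 0$, and
(11) follows.

To prove (10), use inverse Fourier transform to write

$P(S_n = x + g) - P(S_n = x) = \frac{1}{N} \sum_{\ell=0}^{N-1} \hat{S}_n(\ell)\,\bigl(\varphi_{-\ell}(x + g) - \varphi_{-\ell}(x)\bigr).$

If $N \mid g\ell$ we have $\varphi_{-\ell}(x + g) - \varphi_{-\ell}(x) = 0$. If $N \nmid g\ell$, apply (11). □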
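The mechanism of Lemma 5.1 can be observed numerically. The following is a minimal sketch, not part of the original argument: the choice $N = 6$, $g = 2$ and the particular laws of $X_j$ (mass $q_j$ at each of $g$ and $2g$, the rest at $0$) are hypothetical. With $q_j = 1/(j+3)$ the sum in (9) diverges and the gap in (10) shrinks to zero; with $q_j = 1/(j+3)^2$ the sum converges and the gap stays bounded away from zero.

```python
def convolve_mod(dist_a, dist_b, N):
    """Law of A + B (mod N) for independent A, B given as length-N lists."""
    out = [0.0] * N
    for a, pa in enumerate(dist_a):
        for b, pb in enumerate(dist_b):
            out[(a + b) % N] += pa * pb
    return out

def max_shift_gap(dist, g):
    """max over x of |P(S = x + g) - P(S = x)|, the quantity in (10)."""
    N = len(dist)
    return max(abs(dist[(x + g) % N] - dist[x]) for x in range(N))

def shift_gap_after(n_steps, weight, N=6, g=2):
    """Exact law of S_n = X_1 + ... + X_n (mod N), then the gap in (10).

    Hypothetical X_j: mass weight(j) at g and at 2g, remaining mass at 0.
    """
    S = [1.0] + [0.0] * (N - 1)  # S_0 = 0
    for j in range(1, n_steps + 1):
        q = weight(j)
        X = [0.0] * N
        X[0] = 1.0 - 2.0 * q
        X[g % N] += q
        X[(2 * g) % N] += q
        S = convolve_mod(S, X, N)
    return max_shift_gap(S, g)

divergent = shift_gap_after(2000, lambda j: 1.0 / (j + 3))      # (9) holds
summable = shift_gap_after(2000, lambda j: 1.0 / (j + 3) ** 2)  # (9) fails
print(divergent, summable)
```

In this example the law of $S_n$ lives on the subgroup $\{0, 2, 4\}$, so the divergent case is approaching the uniform distribution on that subgroup, in line with Lemma 5.2.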



Lemma 5.2. Let $\{X_n\}$ be an infinite sequence of independent random
variables with values in $\mathbb{Z}_N \stackrel{\mathrm{def}}{=} \mathbb{Z}/N\mathbb{Z}$, for some fixed integer $N > 1$. Suppose
that there exists a subgroup $G \subset \mathbb{Z}_N$, generated by $g_1, \ldots, g_r$, such that (9)
holds for $g = g_1, \ldots, g_r$. Let $S_n = X_1 + \cdots + X_n \pmod N$, and let $S_n \bmod G$
denote the projection of $S_n$ to $\mathbb{Z}_N/G$. Then

$E\,H(S_n \mid S_n \bmod G) \to \log|G|.$

Proof. Let $g_0 \in G$ be a generator of $G$. Applying Lemma 5.1 for $g =
g_1, \ldots, g_r$, we see that (10) holds for $g = g_0$ as well. Denote for a moment
$\bar{S}_n = S_n \bmod G$. If $G = \mathbb{Z}_N$, the result is clear: $\bar{S}_n$ is constant, while the
distribution of $S_n$ converges to the uniform distribution on $\mathbb{Z}_N$, whose entropy
is $\log|\mathbb{Z}_N| = \log|G|$. In the general case, write

$E\,H(S_n \mid \bar{S}_n) = \sum_{t + G \in \mathbb{Z}_N/G} P(\bar{S}_n = t + G)\, H(S_n \mid \bar{S}_n = t + G).$

By (10) for $g = g_0$, the conditional distribution of $S_n$ on every coset $t + G$
converges to the uniform distribution whose entropy is again $\log|G|$, and we
are done. □

In the next lemma we restrict our attention to $N = p^k$ and identify $\mathbb{Z}_{p^k}$
with $k$-digit numbers in base $p$. For $n \le k$, let $\pi_n : \mathbb{Z}_{p^k} \to \mathbb{Z}_{p^n}$ denote the
projection to the $n$ most-significant digits, i.e., $\pi_n(z) = \lfloor z/p^{k-n} \rfloor$.

Lemma 5.3. Let $Y$ be a $\mathbb{Z}_{p^k}$-valued random variable. Let $G$ be a subgroup
of $\mathbb{Z}_{p^k}$, and suppose that $n$ satisfies $p^n \ge |G|$. Then

$H(\pi_n(Y)) \ge H(Y \mid Y \bmod G) = H(Y) - H(Y \bmod G).$

Proof. In every coset $t + G$, the projection $\pi_n$ is one-to-one. Thus $\pi_n(Y)$
and $Y \bmod G$ determine $Y$; hence $H(\pi_n(Y)) + H(Y \bmod G) \ge H(Y)$. □
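As a sanity check of Lemma 5.3, the sketch below verifies on a small case that $\pi_n$ is one-to-one on cosets when $p^n \ge |G|$, that this fails when $p^n < |G|$, and that the entropy inequality holds. The parameters $p = 3$, $k = 4$, $|G| = 9$ and the law of $Y$ are arbitrary illustrative choices, not taken from the paper.

```python
import math
from collections import defaultdict

def entropy(probs):
    """Shannon entropy (natural log) of an iterable of probabilities."""
    return -sum(q * math.log(q) for q in probs if q > 0)

def pushforward(dist, f):
    """Law of f(Y) when Y has law `dist` (a dict value -> probability)."""
    out = defaultdict(float)
    for y, q in dist.items():
        out[f(y)] += q
    return out

def injective_on_cosets(p, k, n, order):
    """True if pi_n(z) = z // p**(k-n) is one-to-one on every coset of the
    subgroup G of Z/p^k Z with |G| = order (order must divide p**k)."""
    M = p ** k
    step = M // order  # G = {0, step, 2*step, ...}
    return all(
        len({((t + i * step) % M) // p ** (k - n) for i in range(order)}) == order
        for t in range(step)
    )

p, k, n, order = 3, 4, 2, 9
M, step = p ** k, p ** k // order
# Hypothetical non-uniform law of Y on Z/81Z.
weights = [(7 * y * y + 3 * y + 1) % 11 + 1 for y in range(M)]
total = sum(weights)
Y = {y: w / total for y, w in enumerate(weights)}

lhs = entropy(pushforward(Y, lambda y: y // p ** (k - n)).values())  # H(pi_n(Y))
rhs = entropy(Y.values()) - entropy(pushforward(Y, lambda y: y % step).values())
print(lhs, rhs)
```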

Finally, we state the following simple lemma that will be used to show
that the subgroup $G$ we get in 5.1 will be large. For $t = \sum_{i=1}^{k} t_i p^{k-i} \in \mathbb{Z}/p^k\mathbb{Z}$
we call $t_i$ the $i$th digit of $t$.

Lemma 5.4. Let $t \in G \subset \mathbb{Z}/p^k\mathbb{Z}$, and suppose that the $i$th digit of $t$ is
nonzero. Then $t$ generates in $G$ a subgroup of order at least $p_*^i$, where $p_*$ is the
smallest prime divisor of $p$.

We leave the proof of this lemma to the reader. 
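Lemma 5.4 can be checked by brute force for small moduli. The sketch below assumes the reading "order at least $p_*^i$" with digits numbered from the most significant one, and uses the fact that the order of $t$ in $\mathbb{Z}/p^k\mathbb{Z}$ is $p^k/\gcd(t, p^k)$; the function names are ours.

```python
from math import gcd

def smallest_prime_factor(p):
    d = 2
    while d * d <= p:
        if p % d == 0:
            return d
        d += 1
    return p

def digits_msb_first(t, p, k):
    """Digits t_1, ..., t_k of t = sum_i t_i * p**(k - i)."""
    return [(t // p ** (k - i)) % p for i in range(1, k + 1)]

def check_lemma_5_4(p, k):
    """Check order(t) >= p_star**i whenever the i-th digit of t is nonzero."""
    M = p ** k
    p_star = smallest_prime_factor(p)
    for t in range(1, M):
        order = M // gcd(t, M)  # order of t in the cyclic group Z/MZ
        for i, digit in enumerate(digits_msb_first(t, p, k), start=1):
            if digit != 0 and order < p_star ** i:
                return False
    return True

print(all(check_lemma_5_4(p, k) for p, k in [(6, 3), (10, 2), (4, 4), (12, 2)]))
```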

6. Entropy and subgroups 

Notation. Suppose that $\tilde\mu$ is a $\sigma_p \times \sigma_p \times \cdots$-invariant measure on $\mathbb{T}^{\mathbb{N}}$.
For $x \in \mathbb{T}^{\mathbb{N}}$ we denote its projection on coordinates by superscripts, e.g.,
$x^{a\ldots b} = \pi^{a\ldots b}(x)$ denotes projection to coordinates $a, \ldots, b$. Similarly, $x^a =
x^{a\ldots a} = \pi^a(x)$. We identify numbers in $\mathbb{T}$ with their base-$p$ expansion. In this
identification, $\sigma_p$ is equivalent to a shift map on $\{0, \ldots, p-1\}^{\mathbb{N}}$, and we denote
by $x^i_{c\ldots d}$ the $d - c + 1$ digits starting at point $c$ in the base-$p$ expansion of $x^i$.
For every $i$ let $\alpha^i_1$ denote the partition of $\mathbb{T}^{\mathbb{N}}$ according to the first digit of $x^i$,
and we use the notation

$\alpha^{a\ldots b}_{c\ldots d} = \bigvee_{i=a}^{b} \bigvee_{j=c}^{d} \alpha^i_j$

for the partition of $\mathbb{T}^{\mathbb{N}}$ according to the digits $c, \ldots, d$ of $x^a, \ldots, x^b$. By a slight
abuse of notation $\alpha^i_1$ also denotes the analogous partition of $\pi^i(\mathbb{T}^{\mathbb{N}})$. Recall
the definition of the function $\Theta^n : \mathbb{T}^{\mathbb{N}} \to \mathbb{T}$ as the sum mod 1 of $x^1, \ldots, x^n$. We
denote the partition of $\mathbb{T}^{\mathbb{N}}$ according to the first $k$ digits of $\Theta^n(x)$ by $\theta^n_{1\ldots k}$, etc.

We also use the following notation for a measure $\mu$ and a finite partition
$\gamma$: we denote by $D_\mu(\gamma)$ the probability vector $\{\mu(C) : C \in \gamma\}$, and by $H_\mu(\gamma) =
H(D_\mu(\gamma))$ its entropy. Similarly, given a $\sigma$-algebra $\mathcal{B}$ we denote by $D_\mu(\gamma \mid \mathcal{B})$
the conditional distribution of $\gamma$ given $\mathcal{B}$, and by $H_\mu(\gamma \mid \mathcal{B}) = H(D_\mu(\gamma \mid \mathcal{B}))$ its
entropy; note that this is a function on the measure space, not a constant.

In our context it is natural to view $x^a_{1\ldots k}$ as an element of $\mathbb{Z}/p^k\mathbb{Z}$, and thus
if $G$ is a subgroup of $\mathbb{Z}/p^k\mathbb{Z}$, it makes sense to consider $x^a_{1\ldots k} \bmod G$ or the
corresponding partition of $\pi^a(\mathbb{T}^{\mathbb{N}})$, which is $\alpha^a_{1\ldots k} \bmod G$.

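Proposition 6.1. Let $\mu$, $\nu$ be two $p$-invariant measures, and $G$ a subgroup
of $\mathbb{Z}/p^k\mathbb{Z}$ for some $k \in \mathbb{N}$. Then

$E\,H_{\mu*\nu}(x_{1\ldots k} \bmod G \mid x_{k+1\ldots\infty}) \ge E\,H_\mu(x_{1\ldots k} \bmod G \mid x_{k+1\ldots\infty}).$

Proof. Define independent random variables $X$, $Y$ with $X \sim \mu$ and $Y \sim \nu$.
Then $X + Y \sim \mu * \nu$. Denote by $\alpha$, $\beta$, $\gamma$ the partitions according to the first
digit in the base-$p$ expansion of $X$, $Y$ and $X + Y$, respectively. We have

(12) $E\,H_{\mu*\nu}(x_{1\ldots k} \bmod G \mid x_{k+1\ldots\infty}) = E\,H_{\mu\times\nu}(\gamma_{1\ldots k} \bmod G \mid \gamma_{k+1\ldots\infty})$
$\qquad \ge E\,H_{\mu\times\nu}(\gamma_{1\ldots k} \bmod G \mid \alpha_{k+1\ldots\infty} \vee \beta_{1\ldots\infty}).$

However, given all we have conditioned upon on the right-hand side of (12),
$\alpha_{1\ldots k} \bmod G$ uniquely determines $\gamma_{1\ldots k} \bmod G$, and vice versa. Hence

$H_{\mu\times\nu}(\gamma_{1\ldots k} \bmod G \mid \alpha_{k+1\ldots\infty} \vee \beta_{1\ldots\infty}) = H_{\mu\times\nu}(\alpha_{1\ldots k} \bmod G \mid \alpha_{k+1\ldots\infty} \vee \beta_{1\ldots\infty})$
$\qquad = H_{\mu\times\nu}(\alpha_{1\ldots k} \bmod G \mid \alpha_{k+1\ldots\infty}),$

using the independence of $\alpha$ and $\beta$ in the last equality. This gives the required
inequality. □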

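Corollary 6.2. Let $\{\mu_i\}$ be a sequence of $p$-invariant measures, and
denote $\tilde\mu = \prod \mu_i$. Suppose that $G$ is a subgroup of $\mathbb{Z}/p^k\mathbb{Z}$ for some $k \in \mathbb{N}$.
Then

$E\,H_{\tilde\mu}(\theta^n_{1\ldots k} \bmod G \mid \theta^n_{k+1\ldots\infty})$

is monotone nondecreasing in $n$.

Lemma 6.3. Let $\mu$ be a measure on $\mathbb{T}$, and suppose that $G \subset \mathbb{Z}/p^k\mathbb{Z}$ is a
group of size $\ge p^\ell$. Then

$H(\alpha_{1\ldots\ell}) \ge (\ell - 1)\log p - \log|G| + \int H(\alpha_{1\ldots k} \mid \alpha_{1\ldots k} \bmod G \vee \alpha_{k+1\ldots\infty})\,d\mu.$

Proof. Let $n = \lceil \log|G|/\log p \rceil$. Applying Lemma 5.3 in each fiber we get

$H(\alpha_{1\ldots n}) \ge \int H(\alpha_{1\ldots n} \mid \alpha_{k+1\ldots\infty})\,d\mu \ge \int H(\alpha_{1\ldots k} \mid \alpha_{1\ldots k} \bmod G \vee \alpha_{k+1\ldots\infty})\,d\mu.$

Since $n \ge \ell$, we also have

$n\log p - H(\alpha_{1\ldots n}) \ge \ell\log p - H(\alpha_{1\ldots\ell}).$

Now combine the two inequalities, using $n \log p \le \log|G| + \log p$. □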

Finally, we quote the following extension of the Borel-Cantelli Lemma 
([2, ex. 22.4]). 

Lemma 6.4. Let $Y_1, Y_2, \ldots$ be a sequence of nonnegative bounded
independent random variables. Suppose that $\sum E\,Y_i = \infty$. Then $\sum Y_i = \infty$
almost-surely.

7. The Convolution Theorem 

In this section we prove Theorem 1.1. Fix $p$, and consider the function

(13) $\psi(\rho) = H\Bigl(1 - \rho, \frac{\rho}{p-1}, \ldots, \frac{\rho}{p-1}\Bigr).$

Clearly $\psi : [0, 1 - \frac{1}{p}] \to [0, \log p]$ is increasing, onto and concave. A quick
calculation shows that the inverse function satisfies

(14) $\psi^{-1}(h) \ge C\,\frac{h}{|\log(h/\log p)|}$

for some constant $C$ and for all $h$ sufficiently small.

Given a probability distribution $\nu$ on $\{0, \ldots, p-1\}$ define

$\|\nu\|_\infty = \max_{k=0,\ldots,p-1} \nu(k).$

It can be easily verified that

(15) $1 - \|\nu\|_\infty \le \sum_{x=0}^{p-1} \sum_{g=1}^{p-1} \min\{\nu(x),\, \nu(x + g \bmod p)\}$

and also

(16) $H(\nu) \le \psi(1 - \|\nu\|_\infty).$
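The two elementary inequalities (15) and (16) are easy to test numerically. The sketch below is an illustration only: it takes $\psi$ as in (13), reads (15) without any constant in front of the double sum, and tries a few hand-picked distributions; the helper names are ours.

```python
import math

def H(probs):
    """Shannon entropy with natural logarithm."""
    return -sum(q * math.log(q) for q in probs if q > 0)

def psi(rho, p):
    """psi(rho) = H(1 - rho, rho/(p-1), ..., rho/(p-1)), as in (13)."""
    return H([1 - rho] + [rho / (p - 1)] * (p - 1))

def overlap_sum(nu):
    """Right-hand side of (15): sum over x and g != 0 of min{nu(x), nu(x+g)}."""
    p = len(nu)
    return sum(min(nu[x], nu[(x + g) % p]) for x in range(p) for g in range(1, p))

def check_15_and_16(nu, tol=1e-12):
    p = len(nu)
    gap = 1 - max(nu)  # 1 - ||nu||_inf
    return gap <= overlap_sum(nu) + tol and H(nu) <= psi(gap, p) + tol

samples = [
    [0.7, 0.1, 0.2],
    [0.25, 0.25, 0.25, 0.25],
    [0.9, 0.05, 0.03, 0.02, 0.0],
    [1.0, 0.0, 0.0],
]
print([check_15_and_16(nu) for nu in samples])
```

Equality in (16) occurs exactly when the mass outside the largest atom is spread uniformly, as for the uniform distribution in the second sample.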

Let $\mu$ denote a nonatomic measure on $\mathbb{T}$, not necessarily invariant. For
$x \in \mathbb{T}$ denote by $x = \sum x_i p^{-i}$ its expansion in base $p$, and define

$I_k(\mu)(x) = -\log \mu(x_{1\ldots k} \mid x_{k+1\ldots\infty}).$

… $_{\ldots k} \mid x_{k+1\ldots\infty})$. Similarly, given a subgroup $G \subset \mathbb{Z}_{p^k} =
\mathbb{Z}/p^k\mathbb{Z}$ define

$I_G(\mu)(x) = -\log \mu(x_{1\ldots k} \mid x_{1\ldots k} \bmod G \vee x_{k+1\ldots\infty}).$

Then $I_k = I_G$ for $G = \mathbb{Z}_{p^k}$. We remark that



where S G d = E <Vp* • 

Lemma 7.1. Let $G \subset \mathbb{Z}_{p^k}$, and suppose that $\mu$ and $\nu$ are nonatomic
measures on $\mathbb{T}$. Then

$E\,I_k(\mu * \nu) - E\,I_k(\mu) \ge E\,I_G(\mu * \nu) - E\,I_G(\mu).$

Proof. Rewrite the desired inequality as

(17) $E\,I_k(\mu * \nu) - E\,I_G(\mu * \nu) \ge E\,I_k(\mu) - E\,I_G(\mu).$

We have

$E\,I_k(\mu) - E\,I_G(\mu) = E\,H_\mu(x_{1\ldots k} \mid x_{k+1\ldots\infty}) - E\,H_\mu(x_{1\ldots k} \mid x_{1\ldots k} \bmod G \vee x_{k+1\ldots\infty})$
$\qquad = E\,H_\mu(x_{1\ldots k} \bmod G \mid x_{k+1\ldots\infty}).$

Applying the same consideration to $\mu * \nu$, we see that (17) is equivalent to

$E\,H_{\mu*\nu}(x_{1\ldots k} \bmod G \mid x_{k+1\ldots\infty}) \ge E\,H_\mu(x_{1\ldots k} \bmod G \mid x_{k+1\ldots\infty}),$

which follows from Proposition 6.1. □

Lemma 7.2. Let $\{\mu_n\}$ be a sequence of probability measures on $\mathbb{T}$, and
form the product measure $\tilde\mu = \prod \mu_i$. Suppose that for some fixed number $k$,

(18) $\sum_{i=1}^{\infty} E\,\psi^{-1}\bigl(H_{\mu_i}(x^i_k \mid x^i_{k+1\ldots\infty})\bigr) = \infty.$

Then for $\tilde\mu$-almost-every $t \in \mathbb{T}^{\mathbb{N}}$ there exists a group $G_k(t) \subset \mathbb{Z}_{p^k}$ such that

(19) $E\,I_{G_k(t)}(\mu_1 * \cdots * \mu_n) \to E \log|G_k(t)|.$

Furthermore, the map $t \mapsto G_k(t)$ is measurable, and $|G_k(t)| \ge p_*^k$ a.e., where
$p_*$ is the smallest prime factor of $p$.

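Proof. We claim that

(20) $\sum_{n=1}^{\infty} \sum_{x_k=0}^{p-1} \sum_{g_k=1}^{p-1} \min\{\mu_n(x_k \mid t^n_{k+1\ldots\infty}),\, \mu_n(x_k + g_k \bmod p \mid t^n_{k+1\ldots\infty})\} = \infty \quad \tilde\mu\text{-a.e.}$

Indeed, by Lemma 6.4, for every $t$ in a set of full $\tilde\mu$ measure,

$\sum_{i=1}^{\infty} \psi^{-1}\bigl(H_{\mu_i}(t^i_k \mid t^i_{k+1\ldots\infty})\bigr) = \infty.$

By (16), for all such $t$ we have

$\sum_{n=1}^{\infty} \bigl(1 - \|D_{\mu_n}(x_k \mid t^n_{k+1\ldots\infty})\|_\infty\bigr) = \infty.$

Using (15), we conclude that a.e. $t$ satisfies (20).
Define for $t \in \mathbb{T}^{\mathbb{N}}$ a group $G_k(t)$ by

$G_k(t) = \Bigl\langle g \in \mathbb{Z}_{p^k} : \sum_{n=1}^{\infty} \sum_{x=0}^{p^k-1} \min\{\mu_n(x \mid t^n_{k+1\ldots\infty}),\, \mu_n(x + g \bmod p^k \mid t^n_{k+1\ldots\infty})\} = \infty \Bigr\rangle.$

To prove the lemma, take some $t$ satisfying (20). We first show that
$|G_k(t)| \ge p_*^k$. There exist distinct $x_k, y_k \in \{0, \ldots, p-1\}$ such that

$\sum_{n=1}^{\infty} \mu_n(x_k \mid t^n_{k+1\ldots\infty}) = \infty,$

and similarly for $y_k$. Since

$\mu_n(x_k \mid t^n_{k+1\ldots\infty}) = \sum_{\substack{x = 0, \ldots, p^k - 1 \\ x \equiv x_k \ (\mathrm{mod}\ p)}} \mu_n(x \mid t^n_{k+1\ldots\infty}),$

there exists $x \in \{0, \ldots, p^k - 1\}$ with $x \equiv x_k \pmod p$ such that

$\sum_{n=1}^{\infty} \mu_n(x \mid t^n_{k+1\ldots\infty}) = \infty.$

Similarly, there exists $y \in \{0, \ldots, p^k - 1\}$ with $y \equiv y_k \pmod p$ and

$\sum_{n=1}^{\infty} \mu_n(y \mid t^n_{k+1\ldots\infty}) = \infty.$

By definition, $y - x \in G_k(t)$. But the least significant digit of $y - x$ is $y_k - x_k \ne 0$.
By Lemma 5.4, $|G_k(t)| \ge p_*^k$.

It remains to prove (19). Clearly, $I_{G_k(t)}(\mu_1 * \cdots * \mu_n) \le \log|G_k(t)|$. But

$E\,I_{G_k(t)}(\mu_1 * \cdots * \mu_n) = E\,H_{\tilde\mu}(\theta^n_{1\ldots k} \mid \theta^n_{1\ldots k} \bmod G_k(t) \vee \theta^n_{k+1\ldots\infty})$
$\qquad \ge E\,H_{\tilde\mu}(\theta^n_{1\ldots k} \mid \theta^n_{1\ldots k} \bmod G_k(t) \vee \alpha^{1\ldots n}_{k+1\ldots\infty})$
$\qquad \to E \log|G_k(t)|,$

by an application of Lemma 5.2 for the random variables $X_i \sim (\alpha^i_{1\ldots k} \mid t^i_{k+1\ldots\infty})$.
This concludes the proof of (19). □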

Lemma 7.3. Under the assumptions of Lemma 7.2, define

(21) $h_k = \sup_n \frac{1}{k}\, E\,H_{\mu_1 * \cdots * \mu_n}(x_{1\ldots k} \mid x_{k+1\ldots\infty}).$

For any $m$, if $\frac{1}{k}\, E\,H_{\mu_1 * \cdots * \mu_m}(x_{1\ldots k} \mid x_{k+1\ldots\infty}) > h_k - \varepsilon$, then

(22) $H_{\mu_1 * \cdots * \mu_m}(x_{1\ldots\ell}) \ge (\ell - 1)\log p - (k + 1)\varepsilon,$

where $\ell = \ell(k) = \lfloor k \frac{\log p_*}{\log p} \rfloor$.

Proof. We have $|G_k(t)| \ge p_*^k \ge p^\ell$ almost-everywhere. By Lemma 6.3,

(23) $H_{\mu_1 * \cdots * \mu_m}(x_{1\ldots\ell}) \ge (\ell - 1)\log p - \int \bigl(\log|G_k| - I_{G_k}(\mu_1 * \cdots * \mu_m)\bigr)\,d\tilde\mu.$

By Lemma 7.2, there exists some $n > m$ such that $E\,I_{G_k}(\mu_1 * \cdots * \mu_n) >
E \log|G_k| - \varepsilon$. Applying Lemma 7.1 we get

$k\varepsilon \ge E\,I_k(\mu_1 * \cdots * \mu_n) - E\,I_k(\mu_1 * \cdots * \mu_m)$
$\quad \ge E\,I_{G_k}(\mu_1 * \cdots * \mu_n) - E\,I_{G_k}(\mu_1 * \cdots * \mu_m)$
$\quad \ge E \log|G_k| - \varepsilon - E\,I_{G_k}(\mu_1 * \cdots * \mu_m).$

By substituting this into (23) we obtain the desired result. □

Proof of Theorem 1.1. By convexity of $\psi^{-1}$,

$\sum_{n=1}^{\infty} E\,\psi^{-1}\bigl(H_{\mu_n}(t_k \mid t_{k+1\ldots\infty})\bigr) \ge \sum_{n=1}^{\infty} \psi^{-1}\bigl(E\,H_{\mu_n}(t_k \mid t_{k+1\ldots\infty})\bigr) = \sum_{n=1}^{\infty} \psi^{-1}\bigl(h(\mu_n, \sigma_p)\bigr) = \infty,$

applying (1) and (14). Hence the assumptions of Lemmas 7.2-7.3 hold.

By Corollary 6.2, $h(\mu_1 * \cdots * \mu_n, \sigma_p)$ is monotone nondecreasing in $n$.
Define

$h \stackrel{\mathrm{def}}{=} \lim_{n\to\infty} h(\mu_1 * \cdots * \mu_n, \sigma_p) = \sup_n h(\mu_1 * \cdots * \mu_n, \sigma_p).$

We need to show that $h = \log p$.

The $h_k$ defined in (21) satisfy $h_k = h$ for all $k$, by invariance of $\mu_1 * \cdots * \mu_n$.
For arbitrary $\varepsilon$, let $m$ be big enough so that $h(\mu_1 * \cdots * \mu_m, \sigma_p) > h - \varepsilon$. By
Lemma 7.3, (22) holds for every $k$, with $\ell(k) = \lfloor k \frac{\log p_*}{\log p} \rfloor$. Hence

$h(\mu_1 * \cdots * \mu_m, \sigma_p) \ge \lim_{k\to\infty} \frac{(\ell(k) - 1)\log p - (k + 1)\varepsilon}{\ell(k)} = \log p - \varepsilon\,\frac{\log p}{\log p_*}.$

Letting $\varepsilon \downarrow 0$ proves the theorem. □
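The limit used in the last display can be checked numerically. The sketch below evaluates the lower bound from (22), divided by $\ell(k)$, for growing $k$ and compares it with $\log p - \varepsilon \log p/\log p_*$; the values $p = 10$, $p_* = 2$, $\varepsilon = 0.01$ are arbitrary illustrative choices.

```python
import math

def ell(k, p, p_star):
    """ell(k) = floor(k * log(p_star) / log(p))."""
    return math.floor(k * math.log(p_star) / math.log(p))

def lower_bound(k, eps, p, p_star):
    """((ell(k) - 1) log p - (k + 1) eps) / ell(k), the per-digit bound from (22)."""
    l = ell(k, p, p_star)
    return ((l - 1) * math.log(p) - (k + 1) * eps) / l

p, p_star, eps = 10, 2, 0.01
limit = math.log(p) - eps * math.log(p) / math.log(p_star)
values = [lower_bound(k, eps, p, p_star) for k in (10, 100, 10_000, 1_000_000)]
print(values, limit)
```

The factor $\log p/\log p_*$ reflects that $k/\ell(k) \to \log p/\log p_*$, so the loss per digit is $\varepsilon$ times that ratio and vanishes as $\varepsilon \downarrow 0$.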

8. Joinings of full entropy

In this section we study basic properties of joinings of full entropy, and
prove Theorem 1.8.

Let $\alpha^{a\ldots b}_{d} = \alpha^{a\ldots b}_{d\ldots d}$, and denote by $\mathcal{T}^i \stackrel{\mathrm{def}}{=} \bigwedge_{j=1}^{\infty} \alpha^i_{j\ldots\infty}$ the tail $\sigma$-algebra of
$\pi^i(\mathbb{T}^{\mathbb{N}})$. We denote by $\mathcal{T}$ the $\sigma$-algebra $\bigvee_{n=1}^{\infty} \bigl(\bigwedge_{j=1}^{\infty} \alpha^{1\ldots n}_{j\ldots\infty}\bigr)$.

The simplest example of a joining of full entropy is a product measure
$\tilde\mu = \prod \mu_i$. The following lemma shows that in the general case, a similar
independence property holds.

Lemma 8.1. Let $\tilde\mu$ be a joining of full entropy of $\{\mu_i\}_{i=1}^{\infty}$. Then the
random variables $\{x^i\}$ are conditionally independent given $\mathcal{T}$.



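Proof. For any $k, n \ge 1$ we have

(24) $\int H_{\tilde\mu}(\alpha^{1\ldots n}_{1\ldots k} \mid \alpha^{1\ldots n}_{k+1\ldots\infty})\,d\tilde\mu \le \int \sum_{i=1}^{n} H_{\tilde\mu}(\alpha^i_{1\ldots k} \mid \alpha^{1\ldots n}_{k+1\ldots\infty})\,d\tilde\mu$
$\qquad \le \sum_{i=1}^{n} \int H_{\mu_i}(\alpha^i_{1\ldots k} \mid \alpha^i_{k+1\ldots\infty})\,d\mu_i.$

Since $h(\pi^{1\ldots n}(\tilde\mu), \sigma_p \times \cdots \times \sigma_p) = \sum_{i=1}^{n} h(\mu_i, \sigma_p)$, both inequalities must in fact
be equalities. Equality in (24) implies that the random variables $\{x^i_{1\ldots k}\}_{i=1}^{n}$
are independent given $\alpha^{1\ldots n}_{k+1\ldots\infty}$. A reverse martingale argument shows that in
fact $\{x^i_{1\ldots k}\}_{i=1}^{n}$ are independent given $\bigwedge_{j=1}^{\infty} \alpha^{1\ldots n}_{j\ldots\infty}$. By a standard martingale
argument, $\{x^i_{1\ldots k}\}_{i=1}^{n}$ are independent given $\mathcal{T} = \bigvee_{N=1}^{\infty} \bigl(\bigwedge_{j=1}^{\infty} \alpha^{1\ldots N}_{j\ldots\infty}\bigr)$. Since
this is true for any $k, n \ge 1$, the assertion follows. □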



We obtain the following corollary: 

Lemma 8.2. Let $\tilde\mu$ denote a joining of full entropy of $\{\mu_i\}_{i=1}^{\infty}$, and denote
by $\mathcal{L} = \bigwedge_{a=1}^{\infty} \alpha^{a\ldots\infty}_{1\ldots\infty}$ the tail $\sigma$-algebra of the stochastic process $\{x^i\}$. Then $\mathcal{L} \subset \mathcal{T}$.

("The bottom tail is contained in the right tail")

Proof. Let $A$ be some $\mathcal{L}$-measurable set. Denote by $\tau$ the factor map
corresponding to $\mathcal{T}$. By Lemma 8.1, given $\tau(x)$ the random variables $\{x^i\}$
are independent. By Kolmogorov's 0-1 Law, $\tilde\mu(A \mid \mathcal{T})$ is either 0 or 1. □

We cite the following technical lemma: 

Lemma 8.3. Let $(X, T, \mathcal{D}, \nu)$ denote a measure preserving system, and
suppose that $\beta$ and $\gamma$ are two finite partitions of $X$. Denote by $\mathcal{T}_\gamma$ the tail
$\sigma$-algebra corresponding to $\gamma$ and $\nu$. Then for all $k \ge 1$,

$D(\beta_{1\ldots k} \mid \beta_{k+1\ldots\infty} \vee \mathcal{T}_\gamma) = D(\beta_{1\ldots k} \mid \beta_{k+1\ldots\infty}) \quad \nu\text{-a.e.}$

The same applies if instead of $\mathcal{T}_\gamma$ one takes a limit of an increasing sequence
of tails of finite partitions.

Proof. See [17, Cor. 5.28, p. 99], or [16, Lemma 7, p. 65]. □



We also need the following monotonicity property of conditional entropy, 
which follows from Jensen's inequality. 

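Lemma 8.4. Let $\alpha$, $\beta$, $\beta'$, $\gamma$ denote partitions in a measure preserving
system $(X, T, \mathcal{D}, \nu)$, and suppose that $\gamma \le \beta \le \beta'$. Then

$E\bigl(H_\nu(\alpha \mid \beta) \bigm| \gamma\bigr)(x) \ge E\bigl(H_\nu(\alpha \mid \beta') \bigm| \gamma\bigr)(x) \quad \nu\text{-a.e.}$

The same applies if $\beta$ and $\gamma$ are limits of increasing sequences of partitions.

Proposition 8.5. Let $\tilde\mu$ be a joining of full entropy, $k \in \mathbb{N}$, and $G$ a
subgroup of $\mathbb{Z}/p^k\mathbb{Z}$. Then for a.e. $x$,

$E\bigl(H_{\tilde\mu}(\theta^n_{1\ldots k} \bmod G \mid \theta^n_{k+1\ldots\infty}) \bigm| \mathcal{T}\bigr)(x)$

is monotone nondecreasing in $n$.

Proof. By Lemma 8.3, we have $\tilde\mu$-a.e.

$E\bigl(H_{\tilde\mu}(\theta^n_{1\ldots k} \bmod G \mid \theta^n_{k+1\ldots\infty}) \bigm| \mathcal{T}\bigr)(x) = E\bigl(H_{\tilde\mu}(\theta^n_{1\ldots k} \bmod G \mid \theta^n_{k+1\ldots\infty} \vee \mathcal{T}) \bigm| \mathcal{T}\bigr)(x).$

Now apply Corollary 6.2 in each fiber. □

We leave the proof of the following estimate to the reader.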



Lemma 8.6. Let $X$ be a random variable such that $0 \le X \le M$ and
$E X \ge \eta$. Then

$P\bigl(X \ge \tfrac{\eta}{2}\bigr) \ge \frac{\eta}{2M}.$

Corollary 8.7. If $X$ is as in Lemma 8.6, $g$ a monotone nondecreasing
function, then

$E(g(X)) \ge \frac{\eta\, g(\eta/2)}{2M}.$
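The estimate left to the reader follows from $\eta \le E X \le \frac{\eta}{2} + M\,P(X \ge \frac{\eta}{2})$. The sketch below checks it exactly on a small discrete example, reading Lemma 8.6 as $P(X \ge \eta/2) \ge \eta/(2M)$ and Corollary 8.7 as $E\,g(X) \ge \eta\,g(\eta/2)/(2M)$ for nonnegative nondecreasing $g$. The distribution, the choice $\eta = E X$ and the function $g(v) = v^2$ are hypothetical.

```python
from fractions import Fraction

def expectation(values, probs, g=lambda v: v):
    """E[g(X)] for a finitely supported X."""
    return sum(g(v) * q for v, q in zip(values, probs))

def lemma_8_6_holds(values, probs, M):
    eta = expectation(values, probs)  # take eta = E[X]
    tail = sum(q for v, q in zip(values, probs) if v >= eta / 2)
    return tail >= eta / (2 * M)

def corollary_8_7_holds(values, probs, M, g):
    eta = expectation(values, probs)
    return expectation(values, probs, g) >= eta * g(eta / 2) / (2 * M)

# Hypothetical X with 0 <= X <= M = 1, using exact rational arithmetic.
values = [Fraction(0), Fraction(1, 4), Fraction(1)]
probs = [Fraction(8, 10), Fraction(1, 10), Fraction(1, 10)]
print(lemma_8_6_holds(values, probs, 1),
      corollary_8_7_holds(values, probs, 1, lambda v: v * v))
```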

The last lemma we need is the following variant of Shannon's theorem.
Denote by $(\mu \mid F)$ the measure $\mu$ conditioned on $F$, i.e.,

$(\mu \mid F)(A) = \mu(A \cap F)/\mu(F).$

Lemma 8.8. Let $\beta$ be a finite partition in an ergodic measure preserving
system $(X, \mu, T)$, and denote $\beta_{1\ldots k} = \beta \vee \cdots \vee T^{-(k-1)}\beta$. Suppose that for
some sequence of subsets $\{F_k\}$ and a constant $\gamma > 0$, for every $k$ large enough
$\mu(F_k) > \gamma$. Then

$h_\mu(\beta, T) \ge \limsup_{k\to\infty} \frac{1}{k}\, H_{\mu|F_k}(\beta_{1\ldots k}).$

Proof. Let $\varepsilon > 0$, and denote by $N$ the number of atoms in $\beta$. By
Shannon's theorem, for $k$ large enough it is possible to cover a subset of $X$ of
$\mu$-measure $1 - \varepsilon\gamma$ by at most $\exp\bigl(k(h_\mu(\beta, T) + \varepsilon)\bigr)$ atoms of $\beta_{1\ldots k}$. In particular,
the $\mu|F_k$-measure of the union of these atoms is at least $1 - \varepsilon$. Hence by a
standard property of entropy (see Rudolph [17, Cor. 5.17])

$\frac{1}{k}\, H_{\mu|F_k}(\beta_{1\ldots k}) \le \frac{1}{k} \cdot \bigl(H(\varepsilon, 1 - \varepsilon) + k(h_\mu(\beta, T) + \varepsilon) + \varepsilon k \log N\bigr).$

Letting $k \to \infty$ and $\varepsilon \to 0$ proves the lemma. □

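Proof of Theorem 1.8. Arguing as in Lemma 2.1, the ergodic components
of $\tilde\mu$ are also joinings of full entropy of $\{\mu_i\}$. Since by Rokhlin's theorem the
entropy of a measure is the average of the entropies of its ergodic components,
it is enough to consider the case where $\tilde\mu$ is ergodic.

By Proposition 8.5, $h(\Theta^n\tilde\mu, \sigma_p)$ is monotone nondecreasing in $n$. Define

$h \stackrel{\mathrm{def}}{=} \lim_{n\to\infty} h(\Theta^n\tilde\mu, \sigma_p) = \sup_n h(\Theta^n\tilde\mu, \sigma_p).$

We need to show that $h = \log p$.

Define conditional measures $\bar\mu_i \sim D_{\tilde\mu}(\alpha^i_{1\ldots\infty} \mid \mathcal{T})$. The sequence $\{\bar\mu_i\}$ is a
random sequence of measures, depending on a choice of fiber (a point
in $\mathcal{T}$). By Lemma 8.1, $\{\bar\mu_i\}$ are independent. We claim that the assumptions
of Lemmas 7.2-7.3 hold for $\{\bar\mu_i\}$ with positive probability. To see this,
consider the random variables

$Z_i = E\bigl(\psi^{-1}(H_{\bar\mu_i}(\alpha^i_k \mid \alpha^i_{k+1\ldots\infty})) \bigm| \mathcal{T}\bigr),$
$W_i = E\bigl(H_{\bar\mu_i}(\alpha^i_k \mid \alpha^i_{k+1\ldots\infty}) \bigm| \mathcal{T}\bigr).$

To prove that (18) holds with positive probability, we need to show that
$P(\sum Z_i = \infty) > 0$. Let $\eta \stackrel{\mathrm{def}}{=} \inf_i h(\mu_i, \sigma_p) > 0$. By Corollary 8.7

(26) $Z_i \ge \frac{W_i\, \psi^{-1}(W_i/2)}{2\log p}.$

Since $E(W_i) = h(\mu_i, \sigma_p) \ge \eta$, by Lemma 8.6 and (26),

$P\Bigl(Z_i \ge \frac{\eta\, \psi^{-1}(\eta/4)}{4\log p}\Bigr) \ge \frac{\eta}{2\log p} \stackrel{\mathrm{def}}{=} \eta_*.$

Hence $P(\sum Z_i = \infty) \ge P\bigl(Z_i \ge \frac{\eta\, \psi^{-1}(\eta/4)}{4\log p} \text{ i.o.}\bigr) \ge \eta_* > 0$.

Fix $\varepsilon \in (0, \eta_*)$, and choose $m$ with $h - h(\Theta^m\tilde\mu, \sigma_p) < \varepsilon^2$. For any $k$, define

$T^{(n)}_k(t) = \frac{1}{k}\, E\bigl(H_{\tilde\mu}(\theta^n_{1\ldots k} \mid \theta^n_{k+1\ldots\infty}) \bigm| \mathcal{T}\bigr)(t),$
$T_k(t) = \sup_n T^{(n)}_k(t).$

By Proposition 8.5, $T^{(n)}_k(t)$ is almost-surely monotonically increasing in $n$.
Thus

$0 \le E\bigl(T_k - T^{(m)}_k\bigr) = h - h(\Theta^m\tilde\mu, \sigma_p) < \varepsilon^2;$

hence by Markov's inequality

$P\bigl[T_k - T^{(m)}_k \ge \varepsilon\bigr] \le \frac{\varepsilon^2}{\varepsilon} = \varepsilon.$

Define $F_k = \{\sum Z_i = \infty\} \setminus \{T_k - T^{(m)}_k \ge \varepsilon\}$. The event $\{\sum Z_i = \infty\}$ is $\mathcal{T}$-measurable
by Lemma 8.2; hence $F_k$ is $\mathcal{T}$-measurable as well. Clearly $\tilde\mu(F_k) \ge
\eta_* - \varepsilon$. Applying Lemma 7.3, we have (denoting $\ell = \ell(k) = \lfloor k \frac{\log p_*}{\log p} \rfloor$)

for all $t \in F_k$, $\quad H_{\tilde\mu}(\theta^m_{1\ldots\ell} \mid \mathcal{T})(t) \ge (\ell - 1)\log p - (k + 1)\varepsilon.$

Thus by Lemma 8.4

$H_{\tilde\mu|F_k}(\theta^m_{1\ldots\ell}) = E\bigl(H_{\tilde\mu}(\theta^m_{1\ldots\ell} \mid \{F_k, F_k^c\}) \bigm| F_k\bigr) \ge (\ell - 1)\log p - (k + 1)\varepsilon.$

Dividing by $\ell$ and applying Lemma 8.8 we get

$h(\Theta^m\tilde\mu, \sigma_p) \ge \log p - \frac{\log p}{\log p_*}\,\varepsilon.$

Letting $\varepsilon \downarrow 0$ proves the theorem. □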



9. Dimension of sum sets 

For any measure $\mu$, let

(27) $\dim \mu \stackrel{\mathrm{def}}{=} \inf\{\dim_H S \mid S \text{ is a Borel set with } \mu(S) = 1\}.$

Define the lower dimension of $\mu$ by

(28) $\underline{\dim}\, \mu \stackrel{\mathrm{def}}{=} \inf\{\dim_H S \mid S \text{ is a Borel set with } \mu(S) > 0\}.$

To compute dimension of measures we use the following lemma from
Billingsley [1].

Billingsley's Lemma. Let $\mu$ be a positive finite measure on $\mathbb{T}$. Assume
$K \subset \mathbb{T}$ is a Borel set satisfying $\mu(K) > 0$ and

$K \subset \Bigl\{x \in \mathbb{T} : \liminf_{\varepsilon \downarrow 0} \frac{\log \mu(B_\varepsilon(x))}{\log \varepsilon} \le \gamma\Bigr\}.$

Then $\dim_H K \le \gamma$. If the liminf is $\gamma$ a.e., then $\dim_H K = \gamma$.

Here $B_\varepsilon(x)$ can be the interval of length $\varepsilon$ centered at $x$, or the mesh
interval with edge $\varepsilon$ containing $x$, etc.

In most of this section we restrict attention to $p$-invariant measures on $\mathbb{T}$,
for some fixed integer $p > 1$. In this case, by the Shannon-McMillan-Breiman
(SMB) Theorem, it follows that if $\mu$ is ergodic then $\dim \mu = h(\mu, \sigma_p)/\log p$. If
$\mu$ is not ergodic, denote by $\mu = \int \mu_\theta\,d\theta$ its ergodic decomposition, and then

(29) $\dim \mu = \operatorname{ess\,sup}_\theta \dim \mu_\theta.$

This (known) fact is proved in Meiri and Peres [15] in a more general context.
We wish to derive the equivalent statement for lower dimension.

Theorem 9.1. Let $\mu$ be a $p$-invariant measure on $\mathbb{T}$, and denote its
ergodic decomposition by $\mu = \int \mu_\theta\,d\theta$. Then $\underline{\dim}\, \mu = \operatorname{ess\,inf}_\theta \dim \mu_\theta$.

Proof. Denote $\gamma = \operatorname{ess\,inf} \dim \mu_\theta$. By the above remarks,
$\gamma = \operatorname{ess\,inf} h(\mu_\theta, \sigma_p)/\log p$.

Let $\psi(x)$ denote the SMB function of $\mu$, i.e., $\psi(x) \stackrel{\mathrm{def}}{=} \lim_{n\to\infty} \frac{1}{n} I_{\alpha_0^{n-1}}(x)$ for the
partition $\alpha = \{[\frac{j}{p}, \frac{j+1}{p})\}_{j=0}^{p-1}$ (see Meiri and Peres [15] for more details). The
function $\psi$ is constant on fibers, and the SMB theorem for nonergodic
transformations implies that for almost every $\theta$ we have $\psi(x) = h(\mu_\theta, \sigma_p)$ $\mu_\theta$-a.e.
(cf. Parry [16, p. 39]). Fix some $\varepsilon > 0$, and let

$B = \Bigl\{x : \frac{\psi(x)}{\log p} < \gamma + \varepsilon\Bigr\}.$

From the definition of $\gamma$ we have $\mu(B) > 0$, hence $\underline{\dim}\, \mu \le \dim_H B$. From
Billingsley's lemma and the SMB theorem we know that $\dim_H B \le \gamma + \varepsilon$. Since
$\varepsilon$ was arbitrary, we conclude that $\underline{\dim}\, \mu \le \gamma$. The other direction is proved
similarly: if $\underline{\dim}\, \mu < \gamma$, then there exists a Borel set $B$ and some $\varepsilon > 0$ such
that $\mu(B) > 0$ and $\dim_H B < \gamma - \varepsilon$. Applying once again Billingsley's lemma
and the SMB theorem we get $\psi(x)/\log p < \gamma - \varepsilon$ for $x \in B$, contradicting the
definition of $\gamma$. □

In particular, if $\mu$ is $p$-invariant and ergodic, we conclude that

(30) $\dim \mu = \underline{\dim}\, \mu = \frac{h(\mu, \sigma_p)}{\log p}.$

Lemma 9.2. Any two measures $\mu$ and $\nu$ on $\mathbb{T}$ satisfy $\underline{\dim}(\mu * \nu) \ge \underline{\dim}\, \mu$.

Proof. Suppose that $B$ is a Borel set with $\dim_H B < \underline{\dim}\, \mu$. We need to
prove that $(\mu * \nu)(B) = 0$. For any $t \in \mathbb{T}$ we have $\dim_H(B - t) = \dim_H B <
\underline{\dim}\, \mu$, so $\mu(B - t) = 0$. Hence $(\mu * \nu)(B) = \int \mu(B - t)\,d\nu(t) = 0$. □

Corollary 9.3. Let $\mu$ and $\nu$ be $p$-invariant measures on $\mathbb{T}$, with $\mu$
ergodic. Let $\mu * \nu = \int \psi_\theta\,d\theta$ be the ergodic decomposition of $\mu * \nu$. Then
$h(\psi_\theta, \sigma_p) \ge h(\mu, \sigma_p)$ for almost-every $\theta$.

Proof. Write

$\operatorname{ess\,inf}_\theta \frac{h(\psi_\theta, \sigma_p)}{\log p} = \underline{\dim}(\mu * \nu)$   by Theorem 9.1
$\qquad \ge \underline{\dim}\, \mu$   by Lemma 9.2
$\qquad = \frac{h(\mu, \sigma_p)}{\log p}$   by (30).

This proves the assertion. □

The point in the last corollary is that $\mu * \nu$ need not, in general, be ergodic,
even if $\mu$ and $\nu$ are ergodic. Furthermore, the entropies of its ergodic components
need not be equal, as seen in the following example:

Example 9.4. A nonergodic convolution.

Take $p = 2$ and let $X = \sum_{i=1}^{\infty} X_i p^{-i}$ denote the random variable on $\mathbb{T}$ for
which $X_i = 0$ if $i \not\equiv 0 \pmod 5$, and otherwise is 0 or 1 with probability
$\frac{1}{2}$. Denote by $\mu$ the distribution of $X + pX + \cdots + p^4 X \pmod 1$. Then
$(\mathbb{T}, \mu, \sigma_p)$ is ergodic but not weakly-mixing (since $\sigma_p^5$ is not ergodic). Also, $\mu^{*5}$
is not ergodic, and decomposes to (finitely many) components with different
entropies (one of them is Lebesgue). It is also possible to construct an example
with a continuum of components.

We turn now to topological corollaries of Theorems 1.1 and 1.8. 

Proof of Corollary 1.2. By the variational principle for expansive maps
(see Walters [19]), for every $i$ there exists a $p$-invariant and ergodic measure
$\mu_i$ supported on $S_i$ with $h(\mu_i, \sigma_p) = h_{\mathrm{top}}(S_i, \sigma_p)$. Recall that

$h_{\mathrm{top}}(S_i, \sigma_p)/\log p = \dim_H S_i;$

hence $\dim \mu_i = h(\mu_i, \sigma_p)/\log p = \dim_H S_i$ by (30). By our assumptions, $h_i \stackrel{\mathrm{def}}{=}
h(\mu_i, \sigma_p)/\log p$ satisfy (1).

Denote $\nu^{(n)} = \mu_1 * \cdots * \mu_n$. Since $\nu^{(n)}(S_1 + \cdots + S_n) = 1$, we have

$\dim \nu^{(n)} \le \dim_H(S_1 + \cdots + S_n),$

so it is enough to show that $\dim \nu^{(n)} \to 1$. Denote by $\nu^{(n)} = \int \nu^{(n)}_\theta\,d\theta$ the
ergodic decomposition of $\nu^{(n)}$. Then

$\dim \nu^{(n)} = \operatorname{ess\,sup}_\theta \{\dim \nu^{(n)}_\theta\}$   by (29)
$\qquad = \operatorname{ess\,sup}_\theta \Bigl\{\frac{h(\nu^{(n)}_\theta, \sigma_p)}{\log p}\Bigr\}$   by (30)
$\qquad \ge \int \frac{h(\nu^{(n)}_\theta, \sigma_p)}{\log p}\,d\theta = \frac{h(\nu^{(n)}, \sigma_p)}{\log p}$   by Rokhlin's theorem
$\qquad \to 1$   by Theorem 1.1. □



Proof of Theorem 1.3. We define inductively a joining of full entropy
$\tilde\mu$. Let $\nu^{(1)} = \mu_1$. Consider the ergodic decomposition of $\mu_1 \times \mu_2$. Since
$\mu_1 \times \mu_2(S_1 \times S_2) > 0$, by Lemma 2.1 we can find an ergodic component $\nu^{(2)}$
such that

(i) $h(\nu^{(2)}, \sigma_p \times \sigma_p) = h(\mu_1, \sigma_p) + h(\mu_2, \sigma_p)$,

(ii) $\nu^{(2)}$ projects to $\mu_1$ and $\mu_2$, and

(iii) $\nu^{(2)}(S_1 \times S_2) > 0$.

Consider next the ergodic decomposition of $\nu^{(2)} \times \mu_3$, and find an ergodic
component $\nu^{(3)}$ with similar properties. Continue in this manner to define a
sequence of measures $\nu^{(n)}$ such that $\nu^{(n)}$ is an ergodic joining of full entropy
of $\mu_1, \ldots, \mu_n$, and $\nu^{(n)}(S_1 \times \cdots \times S_n) > 0$. Let $\tilde\mu$ be the inverse limit of this
system. Then, by Theorem 1.8, $h(\Theta^n \nu^{(n)}, \sigma_p) \to \log p$. As $\nu^{(n)}$ is ergodic,
$\Theta^n \nu^{(n)}$ is ergodic as well. Since $(\Theta^n \nu^{(n)})(S_1 + \cdots + S_n) > 0$, by (27) and (30)
we conclude that

$\dim_H(S_1 + \cdots + S_n) \ge \dim \Theta^n \nu^{(n)} = \frac{h(\Theta^n \nu^{(n)}, \sigma_p)}{\log p} \to 1,$

as stated. □



10. Examples and questions 



1. Recall that a measure $\mu$ on $\mathbb{T}$ is $\{c_n\}$-normal a.e. if $\{c_n x \pmod 1\}$
is uniformly-distributed for $\mu$-almost every $x \in \mathbb{T}$. Suppose that $\mu$ is a
$p$-invariant ergodic measure, such that $\mu * \mu$ is $\{q^n\}$-normal. Does it follow that
$\mu$ is $\{q^n\}$-normal as well? For noninvariant measures, the assertion is false:

Example 10.1. $\mu * \mu$ normal does not imply that $\mu$ is normal.

Let $\{X_i\}$ and $\{Y_i\}$ be independent sequences of random bits with $P(X_i =
0) = \frac{1}{2}$ and $P(Y_i = 0) = 1/i$ for all $i \ge 1$. For $2^k \le i < 2^{k+1}$ define $Z_i = X_i Y_k$.
Let $\mu$ be the distribution of $\sum 2^{-i} Z_i$. Then $\mu$ is $\{2^n\}$-normal in probability,
but not $\{2^n\}$-normal a.e. Also, $\mu^{*n}$ is $\{2^n\}$-normal a.e.

2. Suppose that $\mu$ is a $p$-invariant and ergodic measure on $\mathbb{T}$ with positive
entropy. Does it follow that $\underline{\dim}(\mu^{*n}) \to 1$ as $n \to \infty$? This would mean
that the convergence $h(\mu^{*n}, \sigma_p) \to \log p$ is uniform on all ergodic components
of $\mu^{*n}$.

3. As we remarked in the introduction, the entropy condition in Theo- 
rem 1.1 is sharp. 

Example 10.2. Given numbers $0 < h_i < 1$ with $\sum \frac{h_i}{|\log h_i|} < \infty$, we
construct a sequence of $p$-invariant ergodic measures $\{\mu_i\}$ such that $h(\mu_i, \sigma_p)/\log p
= h_i$, yet $\mu_1 * \cdots * \mu_n \not\to \lambda$ weak*.

For a sequence $\{\beta_i\}$, let $x^i_j$ be independent random variables on
$\{0, \ldots, p-1\}$ with $P(x^i_j = 0) = 1 - \beta_i$, and $P(x^i_j = k) = \beta_i/(p-1)$
for $k = 1, \ldots, p-1$. Define $\mu_i$ to be the distribution of $\sum_{j=1}^{\infty} x^i_j p^{-j}$. Then $\{\mu_i\}$
are Bernoulli measures, and

$h(\mu_i, \sigma_p) = H\Bigl(1 - \beta_i, \frac{\beta_i}{p-1}, \ldots, \frac{\beta_i}{p-1}\Bigr) \sim \beta_i \log\frac{1}{\beta_i}.$

Define $\beta_i$ by requiring that $h(\mu_i, \sigma_p) = h_i \log p$. Then from the condition on
$\{h_i\}$ we get $\sum \beta_i < \infty$ (see §6). It is not hard to see that $|\hat\mu_i(1) - 1| \le 4\pi\beta_i$.
Thus $\sum |\hat\mu_i(1) - 1| < \infty$. But then

$|(\mu_1 * \cdots * \mu_n)^\wedge(1)| = \prod_{i=1}^{n} |\hat\mu_i(1)| \ge \prod_{i=1}^{\infty} |\hat\mu_i(1)| > 0,$

whence $(\mu_1 * \cdots * \mu_n)^\wedge(1) \not\to 0$.
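The Fourier estimate in Example 10.2 can be checked numerically. Because the digits are independent, $\hat\mu_i(1)$ factors as a product over digits of $E \exp(2\pi i\, x_j p^{-j})$. The sketch below is an illustration under stated assumptions: the product is truncated at 60 digits, the sign convention of the Fourier transform is immaterial for the moduli, and the summable sequence $\beta_i = 4^{-i}$ with $p = 2$ is a hypothetical choice.

```python
import cmath
import math

def fourier_coefficient_at_1(beta, p, n_digits=60):
    """Approximate mu-hat(1) for the Bernoulli measure with digit law
    P(digit = 0) = 1 - beta, P(digit = k) = beta / (p - 1) for k = 1..p-1."""
    coeff = 1 + 0j
    for d in range(1, n_digits + 1):
        digit_mean = (1 - beta) + (beta / (p - 1)) * sum(
            cmath.exp(2 * math.pi * 1j * k / p ** d) for k in range(1, p)
        )
        coeff *= digit_mean
    return coeff

def bound_holds(beta, p):
    """Check |mu-hat(1) - 1| <= 4 * pi * beta."""
    return abs(fourier_coefficient_at_1(beta, p) - 1) <= 4 * math.pi * beta

# Summable beta_i: the product of |mu_i-hat(1)| stays away from zero.
product = 1.0
for i in range(1, 31):
    product *= abs(fourier_coefficient_at_1(4.0 ** (-i), 2))
print(product)
```

Since Lebesgue measure has vanishing Fourier coefficient at 1, a product bounded away from zero rules out weak* convergence of the convolutions to $\lambda$.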

4. Is the dimension condition of Corollary 1.2 sharp as well? Specifically,
given a sequence of numbers $0 < d_i < 1$ such that $\sum d_i/|\log d_i| < \infty$, can
one always find $p$-invariant closed subsets $S_i \subset \mathbb{T}$ with $\dim_H S_i = d_i$ and
$\lim \dim_H(S_1 + \cdots + S_n) < 1$? Currently we can construct sets satisfying the
desired conclusion, but only when $\sum d_i/|\log d_i|$ is small enough.

Example 10.3. Given numbers $0 < d_i < 1$ with $\sum \frac{d_i}{|\log d_i|} < c(p)$ (for
some $c(p)$ that can be made explicit) there is a sequence of $p$-invariant closed
sets $S_i$ such that $\dim_H S_i = d_i$, yet $\lim \dim_H(S_1 + \cdots + S_n) < 1$.

Indeed, consider the set

$S(N, \beta) = \{x \in \mathbb{T} : \forall n \in \mathbb{N},\ (\#\text{ non-0 digits in } x_{n\ldots n+N-1}) \le \beta N\}.$

If is constructed in the same way the measures \ii were defined in the 
previous subsection, then clearly 

dim// b (N, (3) = — ► — as TV — > oo. 

logp logp 

Thus we can choose $N_i$ and $\beta_i$ so that if $S_i = S(N_i, \beta_i)$, then

(31) $d_i \le \dim_H S_i \le (1 + \varepsilon) d_i$, and

(1-*)* < k -^<{l + e)d l . 
logp 

We also assume $N_i \mid N_{i+1}$ and $N_i > 2^i/\varepsilon$ for all $i$. The exact condition we need is
that $\sum \beta_i < 1 - \varepsilon$, which can be translated to a condition on $\sum d_i/|\log d_i|$
via (31).

We shall presently show that $\lim \dim_H(S_1 + \cdots + S_n) < 1$. It is possible to
replace the sets $S_i$ by $S'_i \subset S_i$ such that $\dim_H S'_i = d_i$, using the fact that the
$S_i$'s we defined above are shifts of finite type. Shifts of finite type have many
closed $p$-invariant subsets, and in particular have a closed $p$-invariant subset
with any Hausdorff dimension between 0 and the dimension of the full set.
Thus there is a closed $p$-invariant subset $S'_i \subset S_i$ with the required Hausdorff
dimension (properties of shifts of finite type are discussed in Denker et al. [5];
the above result is a consequence of the Jewett-Krieger theorem (chap. 29 in
that reference)). Clearly $\lim \dim_H(S'_1 + \cdots + S'_n) \le \lim \dim_H(S_1 + \cdots + S_n)$. The
proof that $\lim \dim_H(S_1 + \cdots + S_n) < 1$ follows from the following proposition:

Proposition 10.4. If $N \mid M$,

$S(N, \beta) + S(M, \beta') \subset S\bigl(M, \beta + \beta' + \tfrac{1}{M}\bigr).$

Proof. Consider the $M$-block $z_{n\ldots n+M-1}$ for any $z \in S(N, \beta) + S(M, \beta')$
as an element of $\mathbb{Z}/p^M\mathbb{Z}$. Then there are $x \in S(N, \beta)$ and $x' \in S(M, \beta')$ such
that

$x_{n\ldots n+M-1} + x'_{n\ldots n+M-1} \bmod p^M \le z_{n\ldots n+M-1} \le (x_{n\ldots n+M-1} + x'_{n\ldots n+M-1} \bmod p^M) + p - 1.$

Note that for any $a, b \in \mathbb{N}$, their base-$p$ expansions satisfy

(32) $(\#\text{ non-0 digits in } a) + (\#\text{ non-0 digits in } b) \ge (\#\text{ non-0 digits in } a + b).$

This can be shown, for example, by induction.

Now in $x_{n\ldots n+M-1}$ there are at most $\beta M$ non-0 digits, and in $x'_{n\ldots n+M-1}$
at most $\beta' M$ non-0 digits. Thus using (32), $z_{n\ldots n+M-1}$ can have at most
$(\beta + \beta')M + 1$ non-0 digits, and the proposition follows. □
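Inequality (32), the subadditivity of the number of nonzero base-$p$ digits, is the combinatorial core of the proof and can be confirmed by exhaustive search over a small range. The ranges and bases below are arbitrary, and the helper names are ours.

```python
def nonzero_digits(a, p):
    """Number of nonzero digits in the base-p expansion of a >= 0."""
    count = 0
    while a > 0:
        if a % p != 0:
            count += 1
        a //= p
    return count

def subadditive_up_to(limit, p):
    """Check (32) for all 0 <= a, b < limit in base p."""
    return all(
        nonzero_digits(a, p) + nonzero_digits(b, p) >= nonzero_digits(a + b, p)
        for a in range(limit)
        for b in range(limit)
    )

print(subadditive_up_to(200, 2), subadditive_up_to(200, 3), subadditive_up_to(300, 10))
```

Carries can only merge or cancel nonzero digits (for instance $99 + 1 = 100$ in base 10), which is why the count never increases under addition.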

Using the above proposition, we see that for any $n$,

$S_1 + \cdots + S_n \subset S\bigl(N_n, \varepsilon + \sum_{i=1}^{n} \beta_i\bigr).$

As $\varepsilon + \sum \beta_i < 1$, we see that

Mm dMSl+ ... + S ,)<M+S 1 M < , 

n^oo logp 

5. Does Theorem A of Section 1.1 hold under the weaker assumption 
that the collision exponent is smaller than two? By the Bootstrap Lemma, a 
positive answer to the first question in this section would give a positive answer 
to this question. In particular, this would imply that every p-invariant measure 
of positive entropy is $\{q^n\}$-normal, for any $p$, $q$ such that some prime factor of
$p$ does not divide $q$. The case where $\mu$ is a Bernoulli measure is covered by
Feldman and Smorodinsky [7], for any multiplicatively-independent $p$, $q$.

6. The following example shows that Theorem 1.8 is not valid under the 
weaker assumptions of Theorem 1.1. In fact, even weak convergence is not 
guaranteed: 

Example 10.5. A joining of full entropy $\tilde\mu$ with $\sum \frac{h_i}{|\log h_i|} = \infty$, yet
$\Theta^n\tilde\mu \not\to \lambda$ weak*.

Take $p = 2$, and fix some irrational $\alpha$. Let $t_0$ be a uniform random
variable on the unit interval, and define $t_n = t_0 + n\alpha \pmod 1$. Let $\{Y^j_i\}_{i,j=1}^{\infty}$
be a sequence of i.i.d. variables with $P(Y^j_i = 0) = P(Y^j_i = 1) = \frac{1}{2}$. Given a
nonnegative sequence $\{h_j\}$, define

$X^j_i = \begin{cases} 0 & \text{if } t_i > h_j, \\ Y^j_i & \text{if } t_i \le h_j, \end{cases}$

and let $X^j = \sum_{i=1}^{\infty} X^j_i 2^{-i}$. Denote by $\mu_j$ the distribution of $X^j$, and by $\tilde\mu$ the
distribution of $\{X^j\}_{j=1}^{\infty}$ on $\mathbb{T}^{\mathbb{N}}$. Then $\mu_j$ is a $p$-invariant and ergodic measure
with entropy hj (in base 2), and jl is a joining of full entropy. Take hj = j^. 
We claim that for small enough a we have Q k jl -/-^ A weak*. To see that, 
define the following events: 

An = GNVz = 1,...,N X{ = 0}, 
B N = {Vi > N'ij > 2 i/2 Xj = 0}. 

oo 

If U > hi then X{ = for all j. Take such that £ 2~ 1 / 2 < |, and fix 

i=N+l 

a € (0, ^). Clearly F(A N ) > P(t, > h ± Vi = 1, . . . , AT) > F(h G (i, |)) = |. 
Also, $P(B_N^c) \le P(\exists i > N,\ t_i \le 2^{-i/2})$, since if $t_i > 2^{-i/2}$ for all $i > N$,
then for all j > 2 1 / 2 we have ti > i > hj; hence X\ = 0. By our choice of N, 

oo OO -. 

nm< E p(^<2- i/2 )= e 2-/ 2 <-, 

i=AT+l i=AT+l 

and we conclude that P^nBjv) > P(Ajv)-P(-B^) > §. For (X/) G ^jvOBjv, 

for any the first N/4 digits of the sum X 1 -\ h X fc (mod 1) are zero, and 

so Q k fi -^-> A weak*. 